Władysław Homenda


Formal languages and automata 


Lecture notes 
Warszawa, 2010 


Faculty of Mathematics and Information Science 
Warsaw University of Technology 


University of Technology Development Programme
Human Capital, National Cohesion Strategy
European Union, European Social Fund


Contents 


Contents
Acronyms
1 Regular expressions and regular grammars
  1.1 Regular expressions and regular languages
    1.1.1 Regular expressions
    1.1.2 Regular languages
    1.1.3 The Myhill-Nerode lemma
    1.1.4 The pumping lemma
    1.1.5 Regular grammars
2 Context-free grammars
  2.1 Context-free grammars - basics
  2.2 Simplification of context-free grammars
    2.2.1 Useless symbols
    2.2.2 Nullable symbols and ε-productions
    2.2.3 Unit productions
  2.3 Normal forms of context-free grammars
    2.3.1 Chomsky normal form
    2.3.2 Greibach normal form
  2.4 Pumping and Ogden lemmas
    2.4.1 The pumping lemma
    2.4.2 The Ogden lemma
  2.5 Membership of context-free languages
  2.6 Applications
    2.6.1 Translation grammars
    2.6.2 [...] grammars
3 Context-sensitive and unrestricted grammars
  3.1 Context-sensitive grammars
  3.2 Unrestricted grammars
4 Turing machines
  4.1 Deterministic Turing machines
    4.1.1 Basic model of Turing machines
    4.1.2 Turing machine with the stop property
    4.1.3 Simplifying the stop condition
    4.1.4 Guarding the tape beginning
    4.1.5 Turing machines with a multi-track tape
    4.1.6 Turing machines with two-way infinite tape
    4.1.7 Multi-tape Turing machines
  4.2 Nondeterministic Turing machines
  4.3 Linear bounded automata
5 Pushdown automata
  5.1 Nondeterministic pushdown automata
  5.2 Deterministic pushdown automata
  5.3 Accepting states versus empty stack
  5.4 Pushdown automata as Turing machines
6 Finite automata
  6.1 Deterministic finite automata
  6.2 Nondeterministic finite automata
  6.3 Finite automata with ε-moves
  6.4 Finite automata as Turing machines
7 Grammars versus automata
  7.1 Regular expressions, regular grammars and finite automata
    7.1.1 Regular expressions versus finite automata
    7.1.2 Regular grammars versus finite automata
    7.1.3 The pumping lemma
    7.1.4 The Myhill-Nerode Theorem
    7.1.5 Minimization of deterministic finite automata
  7.2 More grammars and automata
    7.2.1 Context-free grammars versus pushdown automata
8 The hierarchy
  8.1 More operations on languages
    8.1.1 Substitutions, homomorphisms
    8.1.2 Quotients
    8.1.3 Automata building with quotients
  8.2 The hierarchy of languages
  8.3 Closeness


Acronyms 


ε-NFA   finite automata with ε-moves (the class of)
ALL     all languages (the class of)
CFG     context-free grammars (the class of)
CFL     context-free languages (the class of)
CSG     context-sensitive grammars (the class of)
CSL     context-sensitive languages (the class of)
DFA     deterministic finite automata (the class of)
DPDA    deterministic pushdown automata (the class of)
DTM     deterministic Turing machines (the class of)
LBA     linear bounded automata (the class of)
NDFA    nondeterministic finite automata (the class of)
NPDA    nondeterministic pushdown automata (the class of)
NTM     nondeterministic Turing machines (the class of)
PDA     pushdown automata (the class of)
REL     recursively enumerable languages (the class of)
RgL     regular languages (the class of)
RkL     recursive languages (the class of)
TM      Turing machines (the class of)


Chapter 1 
Regular expressions and regular grammars 


Regular languages form the simplest class of formal languages. They are defined
inductively by regular expressions. Regular languages also induce simple algebraic
structures in the set of all words over a given alphabet. Due to this simplicity, they
can be easily distinguished and identified. Regular languages are also generated by
regular grammars and, as will be discussed in Chapter 6, they are accepted by
finite automata.

In this Chapter, fundamental properties of regular expressions and regular
languages are studied. We cover the following topics: operations on regular
expressions, algebraic properties of the relation induced by a regular language
(the so-called Myhill-Nerode lemma, which is a part of the Myhill-Nerode theorem),
and the structure of words of regular languages (the pumping lemma). The study is
supplemented by a section focused on regular grammars. All these topics are
revisited in Chapter 7.


1.1 Regular expressions and regular languages 


1.1.1 Regular expressions 


The definition of regular expressions is inductive, i.e. basic regular expressions are
defined explicitly, while more complex regular expressions are constructed
according to given rules.


Definition 1.1. A regular expression over an alphabet Σ is a construct defined as
follows:

• ∅ is a regular expression,
• ε is a regular expression,
• for each a in Σ, a is a regular expression,
• if r and s are regular expressions, then
  – (r+s), the sum of regular expressions,
  – (rs), the concatenation of regular expressions, and
  – (r)*, the Kleene closure of the regular expression,
  are regular expressions.


Definition 1.2. Regular expressions over an alphabet Σ generate languages:

• ∅ generates the empty language ∅,
• ε generates the language {ε},
• for each a in Σ, a generates the language {a},
• if regular expressions r and s generate languages R and S, then
  – (r+s) generates the language R ∪ S (union of languages R and S),
  – (rs) generates the language R ∘ S (concatenation of languages R and S),
  – (r)* generates the language R* (Kleene closure of the language R).


Remark 1.1. Regular expressions are finite constructs. In fact, they are words over
the alphabet Σ ∪ {+, ∅, *, (, )}, where Σ is the alphabet of a regular expression. On
the other hand, languages generated by regular expressions can be infinite.


Remark 1.2. Regular expressions and formulas describing languages generated by
regular expressions include a large number of brackets, which makes them hard to
read. On the other hand, algebraic operators, logic operators and set-theoretic
operators are assumed to have priorities. This assumption allows dropping most of
the brackets in algebraic, logic and set-theoretic expressions. By analogy, the same
simplification is assumed for regular expressions. It is assumed that the sum
operator has the lowest priority, concatenation has a higher priority and the Kleene
operator has the highest priority. The same assumption is adopted for operators
on languages: union has the lowest priority, concatenation has a higher priority and
the Kleene closure has the highest priority. We will drop any pair of brackets if it
does not change the order of operators when priorities are applied.


Remark 1.3. Regular expressions can be interpreted as strings of symbols.
Therefore, a simplified form of a regular expression is not equal to its original form
in terms of equality of strings. Yet, the languages generated by both forms of a
regular expression are identical, so both forms are considered to be equivalent. From
now on, if not stated otherwise, equivalent regular expressions will be considered to
be equal.


1.1.2 Regular languages 


Definition 1.3. Regular languages are those and only those generated by regular 
expressions. 




Remark 1.4. The following equalities hold (in terms of Remark 1.3).

1. ∅ + r = r + ∅ = r
2. ∅r = r∅ = ∅
3. εr = rε = r
4. ε + r = r + ε
5. r + s = s + r
6. (r + s) + t = r + (s + t) = r + s + t
7. (rs)t = r(st) = rst
8. r(s + t) = rs + rt
9. (r + s)t = rt + st
10. (r*)* = r*
11. (r*s*)* = (r + s)*
12. (r* + s*)* = (r + s)*
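
Some of these equalities can be checked empirically on small examples. The sketch below is not a proof: it represents a language as a Python set of words and compares both sides only on words of length at most `MAX_LEN`. The names `MAX_LEN`, `concat` and `star`, and the sample languages `r`, `s`, `t`, are assumptions made for this illustration.

```python
MAX_LEN = 6  # languages are compared only on words up to this length (assumption)

def concat(R, S):
    """Concatenation of two languages, truncated to words of length <= MAX_LEN."""
    return {u + v for u in R for v in S if len(u + v) <= MAX_LEN}

def star(R):
    """Kleene closure of R, truncated to words of length <= MAX_LEN."""
    result = {""}            # the empty word always belongs to R*
    frontier = {""}
    while frontier:          # stops because only finitely many short words exist
        frontier = concat(frontier, R) - result
        result |= frontier
    return result

# Sample languages standing for the expressions r, s and t.
r, s, t = {"a"}, {"ab", "b"}, {"", "ba"}
empty, eps = set(), {""}

assert empty | r == r                                    # equality 1
assert concat(empty, r) == concat(r, empty) == empty     # equality 2
assert concat(eps, r) == concat(r, eps) == r             # equality 3
assert concat(r, s | t) == concat(r, s) | concat(r, t)   # equality 8
assert star(star(r)) == star(r)                          # equality 10
assert star(concat(star(r), star(s))) == star(r | s)     # equality 11
assert star(star(r) | star(s)) == star(r | s)            # equality 12
```

Truncation is harmless here because a word of length at most `MAX_LEN` in a concatenation or closure can only be built from parts that are themselves that short.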


1.1.3 The Myhill-Nerode lemma 


The Myhill-Nerode theorem is the most important tool characterizing regular langu- 
ages. The theorem is formulated and proved in Chapter 7. In this Chapter a limited 
version of the Myhill-Nerode theorem is formulated, which will be referred to as the 
Myhill-Nerode lemma. The proof of the lemma is a direct consequence of a proof of 
the Myhill-Nerode theorem. The lemma is not proved here, since important topics 
used in the proof have not been discussed yet. The lemma is formulated here since 
it is a crucial tool used in the identification of regular languages. It will be used in 
this and other Chapters. 


Lemma 1.1. (the Myhill-Nerode lemma) A language L is regular if and only if the
relation R_L induced by the language L has a finite number of equivalence classes.
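
The relation R_L is not defined in this Chapter. The sketch below assumes the standard definition: x R_L y if and only if, for every word z, either both xz and yz belong to L or neither does. Under that assumption it searches for suffixes that separate the words ε, a, aa, ... with respect to the language { a^n b^n : n ≥ 0 }. This suggests infinitely many classes, i.e. a language that is not regular. The function names and the bound `max_len` are hypothetical.

```python
from itertools import product

def in_L(word):
    """Membership test for L = { a^n b^n : n >= 0 }."""
    n = len(word) // 2
    return word == "a" * n + "b" * n

def distinguishing_suffix(x, y, max_len=6):
    """Return a suffix z with exactly one of xz, yz in L, or None if none is found."""
    for length in range(max_len + 1):
        for letters in product("ab", repeat=length):
            z = "".join(letters)
            if in_L(x + z) != in_L(y + z):
                return z
    return None

prefixes = ["a" * i for i in range(5)]
for i, x in enumerate(prefixes):
    for y in prefixes[i + 1:]:
        # every pair of distinct prefixes is separated by some suffix
        print(repr(x), repr(y), "->", repr(distinguishing_suffix(x, y)))
```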


1.1.4 The pumping lemma 


The pumping lemma is another tool, in addition to regular expressions and the
Myhill-Nerode lemma, that can be used to characterize regular languages. Regular
expressions and the Myhill-Nerode lemma (and regular grammars, which are
discussed in the next section) are aimed at proving that a language is regular. The
Myhill-Nerode lemma and the pumping lemma can be used in proving that a
language is not regular. As in the case of the Myhill-Nerode lemma, the pumping
lemma is not proved here, since important topics used in the proof have not been
discussed yet. The lemma is formulated here since it is a crucial tool used in the
identification of regular languages. It will be proved in Chapter 7.




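Lemma 1.2. (the pumping lemma for regular languages)
If a language L is regular, then there exists a constant n_L such that for any word
z ∈ L the following condition holds:

$$|z| \ge n_L \;\Rightarrow\; \exists_{u,v,w}\ \Big[ \big( z = uvw \ \wedge\ |uv| \le n_L \ \wedge\ |v| \ge 1 \big) \ \wedge\ \forall_{i=0,1,2,\ldots}\ z_i = uv^i w \in L \Big]$$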


The pumping lemma formulates conditions necessary for a language to be regular. It
shows that the nature of regular languages is finite: the structure of the words of a
given regular language is governed by a constant n_L determined by the pumping
lemma. If a regular language includes a word of length greater than or equal to n_L,
then it is an infinite language. However, the structure of words of length at least
n_L is fairly simple. Such words are generated by inserting strings of length limited
by the constant n_L into words of the language that are shorter than n_L. In other
words, any word of a regular language that is not shorter than n_L has a part that
can be deleted, leaving the remaining part in the language.

Since the pumping lemma formulates necessary conditions, in its direct form it
is of limited importance. It can be used for analysis of the structure of words of a
language. If the words satisfy the conclusion of the pumping lemma, then the
language can be intuitively presumed to be regular. Then, based on such a
supposition, one can attempt to prove that the language is regular. In practice, we
use the contraposition of the pumping lemma rather than its direct version. The
contraposition of the pumping lemma formulates sufficient conditions for a language
not to be regular. This makes the contraposition of the pumping lemma extremely
useful in proving that certain languages are not regular.


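Lemma 1.3. If for any constant n_L there exists a word z ∈ L such that

$$|z| \ge n_L \ \wedge\ \forall_{u,v,w}\ \Big[ \big( z = uvw \ \wedge\ |uv| \le n_L \ \wedge\ |v| \ge 1 \big) \ \Rightarrow\ \exists_{i=0,1,2,\ldots}\ z_i = uv^i w \notin L \Big]$$

then the language L is not regular.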
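
As an illustration, the contraposition can be exercised mechanically for the language { a^n b^n : n ≥ 0 }. The sketch below is an assumption-laden example, not part of the lemma: for each tested constant n it picks the word z = a^n b^n, enumerates every split z = uvw with |uv| ≤ n and |v| ≥ 1, and looks for an exponent i with uv^i w outside the language. A program can only test finitely many constants, so this supports the argument rather than replacing it. The function names and the bound `max_i` are hypothetical.

```python
def in_L(word):
    """Membership test for L = { a^n b^n : n >= 0 }."""
    n = len(word) // 2
    return word == "a" * n + "b" * n

def refuting_exponent(u, v, w, max_i=3):
    """Return some i with u v^i w not in L, or None if all tested i stay in L."""
    for i in range(max_i + 1):
        if not in_L(u + v * i + w):
            return i
    return None

def every_split_fails(z, n):
    """True if every split z = uvw with |uv| <= n and |v| >= 1 can be pumped out of L."""
    for uv_len in range(1, min(n, len(z)) + 1):
        for u_len in range(uv_len):      # guarantees |v| = uv_len - u_len >= 1
            u, v, w = z[:u_len], z[u_len:uv_len], z[uv_len:]
            if refuting_exponent(u, v, w) is None:
                return False
    return True

for n in range(1, 8):
    z = "a" * n + "b" * n                # the word chosen for the constant n
    assert every_split_fails(z, n)
```

For this choice of z the part v always consists of letters a only, so already i = 0 removes some of them and breaks the balance between a and b.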


1.1.5 Regular grammars 


Regular grammars constitute the simplest class of grammars. They generate regular
languages, so they can be used for proving that languages are regular. A proof
that regular grammars generate regular languages will be provided in Chapter 7. In
this section, we provide a definition of regular grammars and use it as a tool for
processing regular languages.


Definition 1.4. A grammar G = (V, T, P, S) is:

• left-linear if and only if all its productions are of the form A → Bw or A → w,
  where A, B are nonterminals, i.e. A, B ∈ V, and w is a string of terminals, possibly
  empty, i.e. w ∈ T*,
• right-linear if and only if all its productions are of the form A → wB or A → w,
  where A, B are nonterminals and w is a string of terminals, possibly empty,
• regular if and only if it is either left-linear or right-linear.
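
The definition can be turned into a simple syntactic check. The following sketch assumes that every grammar symbol is a single character and that a production is a pair of strings (left side, right side), with the empty string standing for the empty word. The function name `classify` and this encoding are assumptions made for the illustration.

```python
def classify(productions, nonterminals):
    """Return the subset of {"left-linear", "right-linear"} that the grammar satisfies.

    productions: iterable of (left, right) pairs of strings; "" is the empty word.
    nonterminals: set of single-character nonterminal symbols.
    """
    productions = list(productions)

    def terminals_only(s):
        return all(ch not in nonterminals for ch in s)

    def is_left(right):     # A -> Bw or A -> w
        return terminals_only(right) or (
            right[0] in nonterminals and terminals_only(right[1:]))

    def is_right(right):    # A -> wB or A -> w
        return terminals_only(right) or (
            right[-1] in nonterminals and terminals_only(right[:-1]))

    kinds = set()
    if all(left in nonterminals and is_left(right) for left, right in productions):
        kinds.add("left-linear")
    if all(left in nonterminals and is_right(right) for left, right in productions):
        kinds.add("right-linear")
    return kinds

print(classify([("S", "aS"), ("S", "b")], {"S"}))   # {'right-linear'}
print(classify([("S", "aSb"), ("S", "")], {"S"}))   # set()
```

A grammar is regular in the sense of Definition 1.4 exactly when the returned set is nonempty.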


Remark 1.5. Every left-linear grammar has an equivalent right-linear grammar and
every right-linear grammar has an equivalent left-linear grammar. Equivalence of
grammars is understood as identity of the generated languages. This observation
will be proved in Chapter 7.


Chapter 2 
Context-free grammars 


The class of context-free languages is the next simplest class of languages after
the class of regular languages. Context-free languages are generated by context-free
grammars, which correspond to regular grammars and regular expressions.
Context-free grammars are among the most important tools used for analysis of
context-free languages. They are incomparably simpler than context-sensitive and
unrestricted grammars. On the other hand, they define a wider class of languages
than the simpler regular grammars. The structure of words they generate is rich.
Moreover, context-free grammars are well studied. They provide effective methods
of analysis and generation of languages. Furthermore, there are effective methods of
automatic analysis and processing of context-free grammars. Due to these
advantages they are widely applied in practice, e.g. in natural language processing,
processing of programming languages, translation of formal languages, pattern
recognition, etc.

Algebraic structures of context-free languages created in the set of all words 
over an alphabet are much more complex than those created by regular languages. 
For this reason algebraic analysis of context-free languages is limited. For instance, 
there is no tool corresponding to the Myhill-Nerode theorem. 

This Chapter is devoted to a discussion of basic properties of context-free
languages, especially those properties that arise out of the analysis of context-free
grammars. Let us emphasize that the analysis of context-free grammars is well
established given their practical relevance.


2.1 Context-free grammars - basics 


Context-free grammars have a simple form of productions: a single nonterminal
forms the left-hand side of a production, while a sequence of terminals and
nonterminals forms its right-hand side. This form of productions allows for a simple
illustration of a derivation. A derivation of a word can be presented as a tree, which
simplifies the proofs of such important properties as the pumping lemma, the Ogden
lemma and a decision algorithm for context-free languages.




Definition 2.1. A grammar G = (V, T, P, S) is a context-free grammar if and only if
the left-hand side of each of its productions is a nonterminal, i.e. any p ∈ P is of
the form A → α, where A ∈ V and α ∈ (V ∪ T)*.


Definition 2.2. A language L ⊆ Σ* is context-free if and only if it is generated by a
context-free grammar.


For a context-free grammar, a derivation of a word can also be presented in the
form of a tree.


Definition 2.3. A derivation tree in a context-free grammar G = (V, T, P, S) is a tree
T = (W, E ⊆ W × W) satisfying the following properties:

1. it is compatible with the inductive definition of a tree, but the set of children of
   every vertex is ordered,
2. every vertex w ∈ W of the tree is labelled with a nonterminal symbol, a terminal
   symbol or the empty word, i.e. with an element of V ∪ T ∪ {ε},
3. the root of the tree is labelled with the initial symbol S of the grammar,
4. internal vertices of the tree (i.e. vertices other than leaves) are labelled with
   nonterminals,
5. if a nonterminal A labels an internal vertex and the children of this vertex are
   labelled with symbols (terminals, nonterminals or the empty word) X_1, X_2, ..., X_k
   in this order, then there exists a production A → X_1 X_2 ... X_k,
6. a vertex labelled with the empty word is the only child of its parent.


We say that a word is ambiguous if it has more than one derivation tree.
Equivalently, we can say that a word is ambiguous if and only if it has more than
one leftmost derivation, or if it has more than one rightmost derivation. A
context-free grammar is said to be ambiguous if and only if it generates an
ambiguous word. A context-free language is inherently ambiguous if and only if
every context-free grammar generating it is ambiguous.
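
Ambiguity of a particular word can be observed by counting its derivation trees. The sketch below uses a hypothetical grammar E → E+E | a and assumes single-character symbols, no ε-productions and no unit productions, so that every symbol of a right-hand side covers a nonempty part of the word. The function `count_trees` and the grammar encoding are assumptions of this example.

```python
from functools import lru_cache

def count_trees(productions, start, word):
    """Count derivation trees of `word`; assumes no ε-productions and no unit productions."""
    nonterminals = {left for left, _ in productions}

    @lru_cache(maxsize=None)
    def symbol(x, i, j):
        """Number of trees rooted at x whose crop is word[i:j]."""
        if x not in nonterminals:
            return 1 if word[i:j] == x else 0
        return sum(sequence(right, i, j)
                   for left, right in productions if left == x)

    @lru_cache(maxsize=None)
    def sequence(right, i, j):
        """Number of ways the symbols of `right` derive word[i:j], each a nonempty part."""
        if len(right) == 1:
            return symbol(right[0], i, j)
        # the first symbol takes word[i:k]; the remaining symbols must cover word[k:j]
        return sum(symbol(right[0], i, k) * sequence(right[1:], k, j)
                   for k in range(i + 1, j - len(right) + 2))

    return symbol(start, 0, len(word))

grammar = (("E", "E+E"), ("E", "a"))
print(count_trees(grammar, "E", "a+a"))     # 1
print(count_trees(grammar, "E", "a+a+a"))   # 2
```

The word a+a+a has two derivation trees, grouping either the first or the second sum first, so this grammar is ambiguous.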


2.2 Simplification of context-free grammars 


The definition of context-free grammars does not include optimization mechanisms.
In this section we discuss some methods of simplification of context-free grammars.
For instance, a context-free grammar including symbols and productions that are
never used in a derivation of any word may be transformed to a form simpler to use.
Also, a context-free grammar can be restructured to a form more suitable for a given
application.




2.2.1 Useless symbols 


In particular, a context-free grammar can include symbols that are useless for the
generated language. Nonterminals that never produce a sequence of terminals form
the first type of useless symbols. Symbols (terminals or nonterminals) never used in
a derivation from the initial symbol of the grammar form the second type of useless
symbols. Both types of useless symbols can be removed together with the
productions including such symbols (such productions are invalid in terms of the
reduced sets of nonterminals and terminals). Both grammars, the initial one and the
grammar with useless symbols and invalid productions removed, generate the same
language. This observation is obvious, since a production that includes useless
symbols can never be used in a derivation of any word over the terminal alphabet.


Proposition 2.1. For any context-free grammar G = (V, T, P, S) generating
a nonempty language L(G) there exists an equivalent (i.e., generating the same
language) context-free grammar G' = (V', T, P', S) such that every nonterminal A ∈ V'
generates a sequence of terminals (possibly empty). Note that the initial symbol is
useless if a context-free grammar generates the empty language. The following
algorithm shows how to remove useless symbols of the first type:


begin
  V_old := ∅;  V_new := ∅
  for each production p: A → α, p ∈ P do
    if α ∈ T* then V_new := V_new ∪ {A}
    {start with nonterminals producing a string of terminals}
  while V_old ≠ V_new do
  begin
    V_old := V_new
    for each production p: A → α, p ∈ P do
      if α ∈ (T ∪ V_old)* then V_new := V_new ∪ {A}
  end
  V' := V_new   {new set of nonterminals}
  P' := {A → α ∈ P : A ∈ V', supp(α) ⊆ (V' ∪ T)}
  {nonterminals of productions must be included into new set of nonterminals}
end
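
The same fixed-point computation can be sketched in Python. The encoding is an assumption of this example: a production is a pair (left, right), where right is a tuple of symbols and the empty tuple stands for the empty word. The function names are hypothetical.

```python
def generating_nonterminals(productions, terminals):
    """Nonterminals that derive some string of terminals (the set V' above)."""
    generating = set()
    changed = True
    while changed:                      # repeat until no new nonterminal is added
        changed = False
        for left, right in productions:
            if left not in generating and all(
                    x in terminals or x in generating for x in right):
                generating.add(left)
                changed = True
    return generating

def remove_nongenerating(productions, terminals):
    """Keep only productions built from terminals and generating nonterminals (P')."""
    keep = generating_nonterminals(productions, terminals)
    return [(left, right) for left, right in productions
            if left in keep and all(x in terminals or x in keep for x in right)]

# Hypothetical grammar: S -> AB | a, A -> a, B -> Bb (B never yields terminals only).
P = [("S", ("A", "B")), ("S", ("a",)), ("A", ("a",)), ("B", ("B", "b"))]
print(remove_nongenerating(P, {"a", "b"}))
# [('S', ('a',)), ('A', ('a',))]
```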


Proposition 2.2. For any context-free grammar G = (V, T, P, S) generating
a nonempty language L(G) there exists an equivalent (i.e., generating the same
language) context-free grammar G'' = (V'', T'', P'', S) such that every nonterminal
symbol A ∈ V'' and every terminal symbol a ∈ T'' can be derived from the initial
symbol of the grammar. The following algorithm allows for removing useless
symbols of the second type:


begin
  V_old := ∅;  T_old := ∅
  V_new := {S};  T_new := ∅
  while V_old ≠ V_new or T_old ≠ T_new do
  begin
    V_old := V_new;  T_old := T_new
    for A ∈ V_old do
    begin
      V_new := V_new ∪ {B ∈ V : exists A → α and B ∈ supp(α)}
      T_new := T_new ∪ {a ∈ T : exists A → α and a ∈ supp(α)}
    end
  end
  V'' := V_new   {new set of nonterminals}
  T'' := T_new   {new set of terminals}
  P'' := {A → α ∈ P : A ∈ V'', supp(α) ⊆ (V'' ∪ T'')}
  {nonterminals of productions must be included into new set of nonterminals}
end
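
A possible Python rendering of this reachability computation is given below, using a worklist instead of the old/new set comparison. The production encoding (pairs of a left side and a tuple of right-side symbols) and the function name are assumptions of this sketch.

```python
def reachable_symbols(productions, start):
    """All symbols (terminals and nonterminals) reachable from the initial symbol."""
    reachable = {start}
    worklist = [start]
    while worklist:
        a = worklist.pop()
        for left, right in productions:
            if left == a:
                for x in right:
                    if x not in reachable:   # a newly discovered symbol
                        reachable.add(x)
                        worklist.append(x)
    return reachable

def remove_unreachable(productions, start):
    """Keep only productions whose left side is reachable from the initial symbol."""
    keep = reachable_symbols(productions, start)
    return [(left, right) for left, right in productions if left in keep]

# Hypothetical grammar: S -> a, A -> a (A cannot be reached from S).
P = [("S", ("a",)), ("A", ("a",))]
print(reachable_symbols(P, "S") == {"S", "a"})   # True
print(remove_unreachable(P, "S"))                # [('S', ('a',))]
```

If the left side of a production is reachable, so are all symbols of its right side, which is why filtering on the left side alone is enough here.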


2.2.2 Nullable symbols and €-productions 


Elimination of ε-productions and nullable symbols is the next step of simplification
of context-free grammars. An ε-production is a production of the form A → ε, and a
nullable symbol is a nonterminal symbol A producing the empty word, A →* ε. Of
course, if the empty word is derivable in a context-free grammar G, then it is not
possible to eliminate all ε-productions and nullable symbols. However, removing
ε-productions and nullable symbols from the grammar G turns it into a grammar
generating the language L(G) − {ε}. Therefore the process of elimination of nullable
symbols and ε-productions will be applied to context-free grammars not producing
the empty word.

The following algorithm finds nullable symbols in a context-free grammar G = 
(V, T, P, S): 


begin
  V_old := ∅   {begin with no nullable symbols}
  V_new := {A ∈ V : A → ε is a production}
  {add all nonterminals producing directly the empty word}
  while V_old ≠ V_new do
  begin
    V_old := V_new
    V_new := V_new ∪ {A ∈ V : exists A → α, where α ∈ V_old*}
    {add all nonterminals producing directly a word over nullable symbols}
  end
end
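
A compact Python sketch of the same computation follows. It assumes productions encoded as pairs (left, right) with right a tuple of symbols, the empty tuple standing for the empty word. The function name is hypothetical.

```python
def nullable_symbols(productions):
    """Nonterminals A with A ->* ε, computed by fixed-point iteration."""
    # start with nonterminals producing the empty word directly
    nullable = {left for left, right in productions if len(right) == 0}
    changed = True
    while changed:
        changed = False
        for left, right in productions:
            # a right-hand side made of nullable symbols only makes `left` nullable
            if left not in nullable and all(x in nullable for x in right):
                nullable.add(left)
                changed = True
    return nullable

# Hypothetical grammar: S -> AB, A -> ε | a, B -> AA | b.
P = [("S", ("A", "B")), ("A", ()), ("A", ("a",)), ("B", ("A", "A")), ("B", ("b",))]
print(nullable_symbols(P) == {"S", "A", "B"})   # True
```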




Having the set of nullable symbols we are able to remove ε-productions. Observe
that a nullable symbol on the right-hand side of a production either can generate a
string of terminal symbols, or can be turned into the empty word. As a consequence,
a nullable symbol can either be left on the right-hand side of a production (when it
produces a nonempty sequence of terminal symbols), or it can be dropped from the
right-hand side of a production (when it generates the empty word). This observation
leads to the following method.


Proposition 2.3. Let G = (V, T, P, S) be a context-free grammar with no useless
symbols generating a language without the empty word. If A → X_1 X_2 ... X_n is a
production, then this production is replaced with the set of all productions of the
form A → α_1 α_2 ... α_n satisfying the conditions:

• α_i = X_i if X_i is not nullable (i.e., it is a terminal symbol or a nonterminal
  symbol that is not nullable), for all i = 1, 2, ..., n,
• α_i = X_i or α_i = ε if X_i is nullable, for all i = 1, 2, ..., n (i.e. we get two
  productions, one with X_i on the right-hand side and another one without X_i),
• not all α_1, α_2, ..., α_n are equal to the empty word (this condition eliminates
  ε-productions). Of course, the existing ε-productions are removed.

Notice that this method turns nullable nonterminals into nonterminals that are not
nullable rather than removing them from the grammar. Finally, we obtain a grammar
G' = (V, T, P', S) without nullable symbols and ε-productions. This grammar has a
modified set of productions and is equivalent to the grammar G, i.e., generates the
same language.
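
The replacement described in Proposition 2.3 can be sketched as follows. The set of nullable symbols is assumed to be known beforehand, and productions are encoded as pairs (left, right) with right a tuple of symbols. The function name and the sample grammar are hypothetical.

```python
from itertools import product

def eliminate_epsilon(productions, nullable):
    """Build the set P': every nullable symbol is either kept or dropped."""
    new_productions = set()
    for left, right in productions:
        # a nullable symbol has two alternatives, any other symbol exactly one
        options = [((x,), ()) if x in nullable else ((x,),) for x in right]
        for choice in product(*options):
            alpha = tuple(s for part in choice for s in part)
            if alpha:                   # skip the empty right-hand side
                new_productions.add((left, alpha))
    return new_productions

# Hypothetical grammar: S -> AbA, A -> a | ε, with A as the only nullable symbol.
P = [("S", ("A", "b", "A")), ("A", ("a",)), ("A", ())]
for left, right in sorted(eliminate_epsilon(P, {"A"})):
    print(left, "->", " ".join(right))
```

For this grammar the result consists of S → AbA, S → Ab, S → bA, S → b and A → a.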


Proof. The proof of equivalence of both grammars is based on the equivalence of
derivations in the grammars G and G':

• both grammars have the same sets of terminals and nonterminals,
• a derivation in the grammar G of any word w ∈ L(G) can be turned into a derivation
  of the same word in the grammar G':

  – if a derivation does not employ ε-productions, then it is also a derivation
    in the grammar G',

  – if an ε-production X → ε is applied in a part of a derivation in which a
    production Y → αXβ was used prior to this ε-production:

        ... → γ Y δ → γ α X β δ → γ α β δ → ...

    then this part cannot be included in any derivation in G', but it can be turned
    into the fragment of a derivation in G' shown below. Here the production
    Y → αβ, with the nullable symbol X dropped, is employed:

        ... → γ Y δ → γ α β δ → ...

    Finally, if we apply an analogous replacement for every ε-production, the
    derivation of the word w ∈ L(G) in the grammar G is turned into a derivation of
    the same word in the grammar G',

• a derivation in the grammar G' of any word w ∈ L(G') can be turned into a
  derivation of the same word in the grammar G:

  – if a derivation utilizes only productions of the grammar G, then it is a valid
    derivation of the word w in G,

  – otherwise, the derivation employs at least one production of the form Y → αβ of
    the grammar G' obtained from a production of the form Y → α' X β'. Then the
    part of the derivation

        ... → γ Y δ → γ α β δ → ...

    can be replaced by a fragment of a derivation in the grammar G with a series of
    ε-productions inserted:

        ... → γ Y δ → γ α' X β' δ → ... → γ α β δ → ...

    where X → ε is an ε-production in the grammar G and the strings α and β are
    obtained by applying the same scheme to all nullable symbols of both strings.
    This method, applied to all productions of the derivation of the word w ∈ L(G')
    in the grammar G', turns this derivation into a derivation of the same word in
    the grammar G.

Finally, comparing the languages generated by a context-free grammar G and by its
transformed form G' without nullable symbols and ε-productions, we can state that
L(G') = L(G) − {ε}.


2.2.3 Unit productions 


A context-free grammar G = (V, T, P, S) may have unit productions. Unit productions
are of the form A → B, where A, B ∈ V. Unit productions complicate derivations and
do not provide any new abilities for language generation. Elimination of unit
productions is the next step of grammar simplification. The method of removing unit
productions consists in substituting a unit production A → B with a series
of productions A → α_i for all B-productions B → α_i. The method is outlined in the
form of the following algorithm:


begin
  while there exists a unit production A → B do
  begin
    if (A = B) then remove the production
    else
      replace the production A → B with
      productions A → α_1 | ... | α_k
      where α_1, ..., α_k are right hand sides
      of all B-productions
  end
end


The new grammar G''' = (V, T, P''', S) produced by this algorithm may have useless
nonterminal symbols, which need to be removed. For instance, the grammar


G = ({S, A, B}, {a}, {S → A | a, A → B, B → a}, S)


is turned into a grammar without unit productions, but with useless nonterminal
symbols A and B:


G''' = ({S, A, B}, {a}, {S → a, A → a, B → a}, S)


Note that elimination of the unit production S → A introduces the production
S → a, which already exists in the grammar. The process of removing useless symbols
yields the following grammar:


G* = ({S}, {a}, {S → a}, S)
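
The example can be reproduced with a short Python sketch. Note that the sketch does not follow the iterative replacement literally: it first computes, for each nonterminal, the set of nonterminals reachable through unit productions only, and then copies their non-unit productions. This is a swapped-in variant that is meant to yield the same set of productions. The encoding and the function name are assumptions.

```python
def eliminate_unit(productions, nonterminals):
    """Return the productions of a grammar without unit productions."""
    def is_unit(right):
        return len(right) == 1 and right[0] in nonterminals

    # unit_reach[A] = all B with A ->* B using unit productions only (A included)
    unit_reach = {a: {a} for a in nonterminals}
    changed = True
    while changed:
        changed = False
        for left, right in productions:
            if is_unit(right):
                for a in nonterminals:
                    if left in unit_reach[a] and right[0] not in unit_reach[a]:
                        unit_reach[a].add(right[0])
                        changed = True

    # A receives every non-unit right-hand side of every B it reaches
    return {(a, right)
            for a in nonterminals
            for left, right in productions
            if left in unit_reach[a] and not is_unit(right)}

# The grammar G from the example: S -> A | a, A -> B, B -> a.
P = [("S", ("A",)), ("S", ("a",)), ("A", ("B",)), ("B", ("a",))]
print(sorted(eliminate_unit(P, {"S", "A", "B"})))
# [('A', ('a',)), ('B', ('a',)), ('S', ('a',))]
```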


A grammar without unit productions is equivalent to the former one. To prove
this, let us consider a derivation tree in the former grammar. If a part of a derivation
tree matching a derivation A_{i_1} → A_{i_2} → ... → A_{i_k} → α (the last production
is not a unit production, i.e. the vertex A_{i_k} has at least two children or one leaf)
includes a repeating nonterminal in the path
A_{i_1} → ... → A_{i_p} → A' → ... → A' → A_{i_q} → ... → A_{i_k}, then this path can
be shortened to A_{i_1} → ... → A_{i_p} → A' → A_{i_q} → ... → A_{i_k} → α.

Let us assume that a derivation A_{i_1} → A_{i_2} → ... → A_{i_k} → α does not have a
repeating nonterminal. The method of elimination of unit productions provides a
production A_{i_1} → α, so this derivation can be cut to A_{i_1} → α, which creates a
part of a derivation tree in the grammar G'''. And vice versa, if we have a part of a
derivation tree in the grammar G''' matching a production A_{i_1} → α of the grammar
G''', then this production either belongs to the grammar G or it can be turned into a
derivation A_{i_1} → A_{i_2} → ... → A_{i_k} → α in the grammar G.

In conclusion, a derivation tree in the grammar G can be turned into a derivation
tree in the grammar G''' by applying all such transformations. And vice versa, a
derivation tree in the grammar G''' can be turned into a derivation tree in the
grammar G.


2.3 Normal forms of context-free grammars 


We discuss two normal forms of context-free grammars, namely Chomsky normal
form and Greibach normal form. Conversion of context-free grammars to normal
forms comes as a further step in the simplification of grammars. Normal forms are
grammars with restrictions put on the form of productions. Grammars in normal
forms produce languages without the empty word, so only grammars not generating
the empty word can be transformed to normal forms. This is why a context-free
grammar in which the empty word is derivable should first be turned into a form
generating the same language without the empty word. Since both normal forms
do not admit ε-productions, a removal of ε-productions and nullable symbols will
convert any context-free grammar to a form generating the language without the
empty word. Normal forms do not necessarily require removal of useless symbols.
Nevertheless, it is recommended to simplify a grammar by removing useless symbols
first.


2.3.1 Chomsky normal form 


Definition 2.4. A context-free grammar G = (V, T, P, S) is in Chomsky normal form
if and only if its productions are of the form A → BC or A → a, where A, B, C ∈ V
and a ∈ T (i.e. any production turns a nonterminal into two nonterminals or into one
terminal).


Proposition 2.4. Any context-free grammar without €-productions and unit produc- 
tions can be transformed to Chomsky normal form. 


Proof. Let us assume that a context-free grammar G = (V, T, P, S) does not have
ε-productions and unit productions, so the right-hand side of any production is a
terminal symbol or a string of at least two symbols. Productions with one terminal
symbol on the right-hand side are in Chomsky normal form. Such productions will not
be changed.

Productions having at least two symbols on the right-hand side will be transformed
into a set of productions according to the rules:


1. every production of the form A → α_1 a α_2 is substituted with two productions
   A → α_1 A' α_2 and A' → a, where A ∈ V, a ∈ T, α_1 α_2 ∈ (V ∪ T)⁺ and A' is a new
   nonterminal,

2. every production of the form A → A_1 A_2 ... A_n, n > 2, is substituted with two
   productions A → A_{1,2} A_3 ... A_n and A_{1,2} → A_1 A_2, where
   A, A_1, A_2, ..., A_n ∈ V and A_{1,2} is a new nonterminal.


The new grammar $G' = (V', T, P', S)$ includes the newly added nonterminals. All
productions of the grammar $G$ not in Chomsky normal form are replaced with sets of
new productions in Chomsky normal form.
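
The two substitution rules of the proof can be sketched in Python. This is only a minimal illustration under assumptions that are not part of these notes: a grammar is a list of `(lhs, rhs)` pairs with `rhs` a tuple of symbols, the function names `to_cnf` and `is_cnf` and the fresh nonterminal names `X1, X2, ...` are hypothetical, and the input is assumed to contain no ε-productions and no unit productions.

```python
from itertools import count


def to_cnf(productions, terminals):
    """Apply rules 1 and 2 repeatedly until every production is A -> BC or A -> a."""
    fresh = (f"X{i}" for i in count(1))  # hypothetical names, assumed unused in the grammar
    work = list(productions)
    done = []
    while work:
        lhs, rhs = work.pop(0)
        has_terminal = any(s in terminals for s in rhs)
        if len(rhs) == 1 or (len(rhs) == 2 and not has_terminal):
            done.append((lhs, rhs))  # already in Chomsky normal form
            continue
        new = next(fresh)
        if has_terminal:
            # Rule 1: A -> alpha1 a alpha2 becomes A -> alpha1 A' alpha2 and A' -> a
            pos = next(i for i, s in enumerate(rhs) if s in terminals)
            work.append((lhs, rhs[:pos] + (new,) + rhs[pos + 1:]))
            done.append((new, (rhs[pos],)))
        else:
            # Rule 2: A -> A1 A2 ... An becomes A -> A12 A3 ... An and A12 -> A1 A2
            work.append((lhs, (new,) + rhs[2:]))
            done.append((new, rhs[:2]))
    return done


def is_cnf(productions, terminals):
    """Check that every production has the form A -> a or A -> BC."""
    return all(
        (len(rhs) == 1 and rhs[0] in terminals)
        or (len(rhs) == 2 and not any(s in terminals for s in rhs))
        for _, rhs in productions
    )


# Example: S -> a S b | a b
grammar = [("S", ("a", "S", "b")), ("S", ("a", "b"))]
cnf = to_cnf(grammar, {"a", "b"})
```

The sketch introduces a new nonterminal for every occurrence of a terminal, exactly as rule 1 is stated; sharing one new nonterminal per terminal would give a smaller grammar but is a different design choice.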

Both grammars $G$ and $G'$ are equivalent, i.e., they generate the same language.
To show this, let us consider a derivation tree of a word in grammars $G$ and $G'$:

• a local fragment of a derivation tree in the grammar $G$ matching a production
  of the form $A \to \alpha_1 a \alpha_2$ is shown in part (i) of Figure 2.1. This
  fragment can be replaced with a fragment matching the two productions
  $A \to \alpha_1 A' \alpha_2$ and $A' \to a$ (which together substitute the former
  production), as shown in part (ii) of Figure 2.1. The new tree has the same crop
  (although it may not be a valid tree in either of the two grammars). On the other
  hand, a production $A \to \alpha_1 A' \alpha_2$ of the grammar $G'$ forces the
  production $A' \to a$, since the nonterminal symbol $A'$ is unique in the grammar
  $G'$ and appears only in these two productions. Both productions correspond to a
  part of a derivation tree in the grammar $G'$ shown in part (ii) of Figure 2.1.
  This part can be turned into the structure shown in part (i) of Figure 2.1, which
  corresponds to the production $A \to \alpha_1 a \alpha_2$ of the grammar $G$,

• parts of derivation trees corresponding to a production of the form
  $A \to A_1 A_2 \ldots A_n$ and to its substitutions $A \to A_{1,2} A_3 \ldots A_n$
  and $A_{1,2} \to A_1 A_2$ are shown in parts (iii) and (iv) of Figure 2.1,

• finally,

  - substituting all fragments of a derivation tree shown in parts (i) and (iii) of
    Figure 2.1 turns a derivation tree in the grammar $G$ into a derivation tree in
    the grammar $G'$,

  - the opposite substitutions turn a derivation tree in the grammar $G'$ into a
    derivation tree in the grammar $G$,

  - the substitutions do not change the crop of the affected trees,

which justifies the equivalence of grammars $G$ and $G'$.


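Fig. 2.1 Equivalence of a context-free grammar and its Chomsky normal form.
[Figure: derivation-tree fragments (i) to (iv). (i): $A$ with children $\alpha_1$, $a$,
$\alpha_2$; (ii): $A$ with children $\alpha_1$, $A'$, $\alpha_2$, where $A'$ has the
child $a$; (iii): $A$ with children $A_1, A_2, \ldots, A_n$; (iv): $A$ with children
$A_{1,2}, A_3, \ldots, A_n$, where $A_{1,2}$ has the children $A_1$, $A_2$.]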




2.3.2 Greibach normal form 


Definition 2.5. A context-free grammar $G = (V, T, P, S)$ is in Greibach normal form
if and only if its productions are of the form $A \to a\alpha$, where $A \in V$,
$a \in T$, $\alpha \in V^*$ (i.e. any production turns a nonterminal into a terminal
followed by a possibly empty string of nonterminals).


Proposition 2.5. Any context-free grammar without ε-productions and unit productions
can be transformed to Greibach normal form.


Proof. The following method can be used for transformation of a context-free
grammar $G = (V, T, P, S)$ to Greibach normal form:

1. transform the grammar to Chomsky normal form $G_C = (V_C, T, P_C, S)$,
2. enumerate nonterminal symbols in $V_C = \{A_1, A_2, \ldots, A_n\}$,
3. modify the grammar $G_C$ such that the right-hand side of every $A_i$-production
   begins with a terminal symbol or with a nonterminal symbol of a higher index:

   $$(*) \quad A_i \to a\,\alpha \quad \text{or} \quad A_i \to A_j\,\beta,\ j > i, \qquad \text{where } a \in T,\ \alpha, \beta \in V_C^*.$$

   Let us assume that for $i = 1, 2, 3, \ldots, k-1$ all $A_i$-productions satisfy the
   condition (*) given above and that some $A_k$-production does not satisfy this
   condition. If a production $A_k \to A_j \alpha$, where $A_k, A_j \in V_C$ and
   $\alpha \in V_C^*$, does not satisfy (*), then either (i) $k > j$ or (ii) $k = j$,

   (i) replace the production $A_k \to A_j \alpha$ with the productions
   $A_k \to \beta_i \alpha$, one for every $A_j$-production $A_j \to \beta_i$, where
   $\beta_i$ is the right-hand side of that $A_j$-production. The right-hand side of
   each new production $A_k \to \beta_i \alpha$ begins either with a terminal or with
   a nonterminal $A_l$ with index greater than $j$. Observe that every such
   substitution increases the index of the leading nonterminal of the right-hand
   side. Repeating the substitutions at most $k - 1$ times, we obtain
   $A_k$-productions whose right-hand sides all begin either with a terminal symbol,
   or with the nonterminal symbol $A_k$, or with a nonterminal symbol $A_l$ with
   $l > k$, so the case (i) is eliminated,

   (ii) let the following be the set of all $A_k$-productions:

   $$A_k \to A_k \alpha_1 \mid A_k \alpha_2 \mid \ldots \mid A_k \alpha_p \mid \beta_1 \mid \beta_2 \mid \ldots \mid \beta_r$$

   where $\alpha_1, \ldots, \alpha_p \in V_C^*$ and
   $\beta_1, \ldots, \beta_r \in (T \cup \{A_{k+1}, \ldots, A_n\}) \circ V_C^*$,

   i.e. $\beta_i$ is a terminal or a nonterminal with index greater than $k$ followed
   by a sequence (possibly empty) of nonterminals, so the only $A_k$-productions
   violating the condition (*) are those of case (ii).

   This set of $A_k$-productions is replaced with the following set of new
   productions:

   $$A_k \to \beta_1 \mid \beta_2 \mid \ldots \mid \beta_r \mid \beta_1 B_k \mid \beta_2 B_k \mid \ldots \mid \beta_r B_k$$

   $$B_k \to \alpha_1 \mid \alpha_2 \mid \ldots \mid \alpha_p \mid \alpha_1 B_k \mid \alpha_2 B_k \mid \ldots \mid \alpha_p B_k$$




   where $B_k$ is a new nonterminal. Nonterminals $B_i$ are ordered according to
   their indexes and are followed by all nonterminals $A_j$.

   Notice that all newly included productions satisfy the condition (*). Therefore,
   the condition (*) is satisfied for all $A_i$-productions and $B_i$-productions
   for $i = 1, 2, \ldots, k$,

4. the process of the previous point, repeated for successive nonterminals, guarantees
   satisfaction of the condition (*) for all productions. Moreover, the right-hand side
   of every $A_n$-production must begin with a terminal symbol since $A_n$ is the last
   nonterminal in the introduced order, i.e., every $A_n$-production is in Greibach
   normal form,

5. for backward values of $k = n-1, n-2, \ldots, 2, 1$, for every $A_k$-production whose
   right-hand side has a leading nonterminal symbol $A_j$,
   $j \in \{n, n-1, \ldots, k+2, k+1\}$, replace this production with a new set of
   productions. Productions of this new set are obtained by replacing the leading
   nonterminal $A_j$ with the right-hand sides of $A_j$-productions. We do the same
   with $B_j$-productions for decreasing values of the index $j$. Productions of the
   newly created set are in Greibach normal form because all $A_j$-productions and
   $B_j$-productions used in the replacement have already been turned to Greibach
   normal form,

6. as a result, we get a grammar $G_G = (V_G, T, P_G, S)$ in Greibach normal form,
   where $V_G$ is the set of nonterminal symbols $V_C$ supplemented with the
   nonterminal symbols $B_i$ created in point 3 (ii).
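
The replacement of looped productions in point 3 (ii) can be illustrated with a short Python sketch. It covers only this single step, not the whole conversion. The representation (right-hand sides as tuples of symbols) and the function name `unloop` are assumptions made for this illustration.

```python
def unloop(a_k, rhs_list, b_k):
    """Remove productions of the form A_k -> A_k alpha, as in point 3 (ii).

    a_k      -- the nonterminal whose productions are processed
    rhs_list -- right-hand sides (tuples of symbols) of all a_k-productions
    b_k      -- name of the new nonterminal (assumed not to occur in the grammar)
    Returns the new right-hand sides for a_k and the right-hand sides for b_k.
    """
    alphas = [rhs[1:] for rhs in rhs_list if rhs[0] == a_k]  # A_k -> A_k alpha_i
    betas = [rhs for rhs in rhs_list if rhs[0] != a_k]       # A_k -> beta_i
    if not alphas:
        return list(rhs_list), []  # nothing to do, no new nonterminal is needed
    new_a = betas + [beta + (b_k,) for beta in betas]
    new_b = alphas + [alpha + (b_k,) for alpha in alphas]
    return new_a, new_b


# Example: A -> A c | A d | b  becomes  A -> b | b B  and  B -> c | d | c B | d B
new_a, new_b = unloop("A", [("A", "c"), ("A", "d"), ("b",)], "B")
```

Both the original and the new productions in the example generate a letter `b` followed by any string over `{c, d}`; the original grammar grows the word to the left of the nonterminal, the new one to the right.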


It has already been noted that the conversion to Chomsky normal form does not
change the language being generated. The transformation described in point 3 (i)
also leaves the generated language unchanged; the justification is the same as
for conversion to Chomsky normal form and for elimination of unit productions.


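Fig. 2.2 Invariability of unlooping method of a context-free grammar.
[Figure: (a) a fragment of a derivation tree in the grammar $G$ built with looped
productions; (b) the corresponding fragment of a derivation tree in the grammar $G_G$.]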




Now we justify that elimination of looped productions in point 3 (ii) does not
change the language. Let us assume that a part of a derivation tree in the grammar
$G$ employs several productions of the form $A_i \to A_i \alpha_j$, i.e. the
nonterminal symbol $A_i$ is substituted by a string $A_i \alpha_j$ several times and
finally $A_i$ is substituted by a string $\beta$. Consider as an example the fragment
of a derivation tree in the grammar $G$ shown in part (a) of Figure 2.2. This
fragment is equivalent to the following derivation in $G$:

$$A_i \to A_i \alpha_{i_3} \to A_i \alpha_{i_2} \alpha_{i_3} \to A_i \alpha_{i_1} \alpha_{i_2} \alpha_{i_3} \to \beta \alpha_{i_1} \alpha_{i_2} \alpha_{i_3}$$

The rules in point 3 (ii) of Proposition 2.5 applied to the above fragment of the
derivation tree produce a fragment of a derivation tree in the grammar $G_G$. This
fragment is shown in part (b) of Figure 2.2. It is equivalent to the following
derivation carried out in the grammar $G_G$:

$$A_i \to \beta B_i \to \beta \alpha_{i_1} B_i \to \beta \alpha_{i_1} \alpha_{i_2} B_i \to \beta \alpha_{i_1} \alpha_{i_2} \alpha_{i_3}$$

Notice that the derivations presented here may be interleaved with applications of
other productions. Nevertheless, the altered derivations correspond to the same
derivation trees.


2.4 Pumping and Ogden lemmas 


The pumping lemma for context-free languages and the Ogden lemma characterize
the structure of words of context-free languages. Both lemmas are important
tools used in identification of context-free languages, just as the corresponding
tools are in the case of regular languages.


2.4.1 The pumping lemma 


The pumping lemma formulates conditions necessary for a language to be context-free.
It shows that context-free languages are, in a sense, finite in nature: the structure
of the words of a given context-free language is governed by some constant $n_L$,
whose value is determined in the pumping lemma. If a context-free language includes
a word of length greater than or equal to $n_L$, then it is an infinite language.
However, the structure of words of length at least $n_L$ is fairly simple. Such words
are generated by inserting strings of length limited by the constant $n_L$ into words
of the language that are shorter than $n_L$. We can also say that any word of a
context-free language not shorter than $n_L$ has two floating parts that can be
deleted simultaneously, leaving the remaining part in the language (let us recall
that words of regular languages have only one part that can be subjected to deletion).




Lemma 2.1 (the pumping lemma for context-free languages).

If a language $L$ is context-free,

then there exists a constant $n_L$ such that for any word $z \in L$ the following
condition holds:

$$(|z| \ge n_L) \;\Rightarrow\; \Big[\, \bigvee_{u,v,w,x,y} \big( z = uvwxy \,\wedge\, |vwx| \le n_L \,\wedge\, |vx| \ge 1 \big) \;\wedge \bigwedge_{i=0,1,2,\ldots} z_i = u v^i w x^i y \in L \,\Big]$$


Proof. If a language $L$ is finite, then a constant $n_L$ greater than the length of a
longest word of this language satisfies the lemma.

Consider the case of infinite languages. Let us assume that a context-free grammar
$G = (V, T, P, S)$ in Chomsky normal form generates a (context-free) language $L$.
Derivation trees in such a grammar are binary trees. We use the property that the
height (the length of a longest path from the root to a leaf) of any binary tree with
$k$ leaves is not less than $\log_2 k$. All vertices of such a path, except the last
one, are labelled by nonterminal symbols of the grammar. The last vertex of this path
is a leaf of the tree and is labelled by a terminal symbol.

Let $|V| = N$. If we set the constant $n_L = 2^{N+1}$, then the derivation tree of a
word $z$ not shorter than $n_L$ has height not less than $N + 1$. Therefore there
exists a path from the root $S$ to a leaf not shorter than $N + 1$. Consequently,
$N + 1$ vertices of this path are labelled by $N$ nonterminal symbols and the leaf is
labelled by a terminal symbol $a$, cf. Figure 2.3. As a result, there exist two (maybe
more) vertices labelled by the same nonterminal symbol. Let us consider the pair of
vertices closest to the leaf $a$ that are labelled by the same nonterminal symbol $A$.
To distinguish the labels of the vertices of this pair, the nonterminal symbol $A$ is
denoted $A'$ and $A''$, respectively. The crop $z$ of the derivation tree is divided
into five parts: $u$, $v$, $w$, $x$ and $y$, where $w$ is the crop of the subtree
with the root $A''$ while $vwx$ is the crop of the subtree with the root $A'$.


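Fig. 2.3 The derivation tree of a word $z$ of length $2^{|V|+1}$.
[Figure: a derivation tree with the root $S$ whose crop is split into the parts
$u$, $v$, $w$, $x$, $y$.]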


If we replace the subtree with the root $A'$ by the subtree with the root $A''$, we
will get a valid derivation tree with the crop $z_0 = uwy = uv^0wx^0y$. This tree is
shown in the left part of Figure 2.4. If we replace the subtree with the root $A''$
by the subtree with the root $A'$, we will obtain the tree shown in the right part of
Figure 2.4, which is still a valid derivation tree, with the crop
$z_2 = uvvwxxy = uv^2wx^2y$. Replacing the subtree with the root $A''$ in this last
tree by the subtree with the root $A'$, we get a derivation tree with the crop
$z_3 = uvvvwxxxy = uv^3wx^3y$. This iterative process can be continued.

Fig. 2.4 The derivation trees obtained from the tree shown in Figure 2.3.
[Figure: left, the tree with the crop $uwy$; right, the tree with the crop $uvvwxxy$.]

In this way, we show that the pumping lemma for context-free languages is satisfied.
To fulfill formal requirements, mathematical induction should be applied on the
number of replacements of the subtree with the root $A''$ by the subtree with the
root $A'$. The reader can elaborate the details of the inductive proof.


Since the pumping lemma formulates necessary conditions, it is of limited importance
in its direct form. It can be employed to analyze the structure of words of a
language. If words satisfy the conclusion of the pumping lemma, then the language
may be intuitively presumed to be context-free; such a supposition must then be
proved by other means. In practice, we use the contraposition of the pumping lemma
rather than its direct version, as in the case of the pumping lemma for regular
languages. The contraposition of the pumping lemma formulates sufficient conditions
for a language not to be context-free. This makes the contraposition of the pumping
lemma extremely useful in proving that certain languages are not context-free.


Remark 2.1. Let us note that the pumping lemma for regular languages is a special
case of the pumping lemma for context-free languages. The assumption that
$uv = \varepsilon$ turns the pumping lemma for context-free languages into the
pumping lemma for regular languages.


Lemma 2.2. If for any constant $n_L$ there exists a word $z \in L$ such that

$$(|z| \ge n_L) \;\wedge\; \Big[\, \bigwedge_{u,v,w,x,y} \big( z = uvwxy \,\wedge\, |vwx| \le n_L \,\wedge\, |vx| \ge 1 \big) \;\Rightarrow \bigvee_{i=0,1,2,\ldots} z_i = u v^i w x^i y \notin L \,\Big]$$

then the language $L$ is not context-free.
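
The condition of the contraposition can be explored by brute force for one fixed word and one fixed constant. The Python sketch below enumerates every split $z = uvwxy$ with $|vwx| \le n$ and $|vx| \ge 1$ and checks whether some $i$ takes the pumped word out of the language. The languages $\{a^m b^m c^m\}$ and $\{a^m b^m\}$, the function names and the bound `max_i` are assumptions chosen for this illustration. Such a finite check is not a proof, since the lemma quantifies over all constants $n_L$.

```python
def in_abc(s):
    """Membership in { a^m b^m c^m : m >= 0 }, a language used only as an example."""
    m = len(s) // 3
    return s == "a" * m + "b" * m + "c" * m


def in_ab(s):
    """Membership in { a^m b^m : m >= 0 }, a context-free language, for contrast."""
    m = len(s) // 2
    return s == "a" * m + "b" * m


def every_split_fails(z, n, member, max_i=2):
    """True if every split z = uvwxy with |vwx| <= n and |vx| >= 1 has some
    i in 0..max_i such that u v^i w x^i y is not in the language."""
    for start in range(len(z)):                                    # vwx = z[start:end]
        for end in range(start + 1, min(start + n, len(z)) + 1):
            for p in range(start, end + 1):                        # v = z[start:p]
                for q in range(p, end + 1):                        # w = z[p:q], x = z[q:end]
                    u, v, w, x, y = z[:start], z[start:p], z[p:q], z[q:end], z[end:]
                    if not v and not x:
                        continue                                   # |vx| >= 1 is required
                    pumped = (u + v * i + w + x * i + y for i in range(max_i + 1))
                    if all(member(t) for t in pumped):
                        return False                               # this split survives
    return True


no_split_survives = every_split_fails("aaabbbccc", 3, in_abc)
some_split_survives = not every_split_fails("aabb", 2, in_ab)
```

For the word `aaabbbccc` and the constant 3 no split survives, because $vwx$ can touch at most two of the three letters. For `aabb` the split $u = a$, $v = a$, $w = \varepsilon$, $x = b$, $y = b$ can be pumped, as expected for a context-free language.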


2.4.2 The Ogden lemma 


The pumping lemma is a powerful tool used for proving that languages are not
context-free. However, the contraposition of the pumping lemma can hardly be applied
to some types of languages. In such difficult cases the Ogden lemma may help in
proving that a language is not context-free. In fact, the pumping lemma is a special
case of the Ogden lemma. However, the pumping lemma is easier to apply than the
Ogden lemma. This is why the pumping lemma is used for simpler problems.


Lemma 2.3 (the Ogden lemma).

If a language $L$ is context-free,

then there exists a constant $n_L$ such that for any word $z \in L$ and for at least
$n_L$ symbols marked in $z$ there exists a split $z = uvwxy$ satisfying the conditions:

• $vx$ includes at least one marked symbol,
• $vwx$ includes no more than $n_L$ marked symbols

and such that $z_i = uv^iwx^iy \in L$ for any $i = 0, 1, 2, \ldots$


Proof. Let us notice that the height of a derivation tree is, of course, not less
than $\log_2 n_L$. This means that there exists a path from the root to a leaf not
shorter than $\log_2 n_L$. Such a path starts in the root and then, at every vertex,
goes to the child that has no fewer marked leaves in its subtree than the other child
has. Having such a path, we can do the same replacements in the derivation tree as we
did in the proof of the pumping lemma.

Note that by marking all symbols of a word we turn the Ogden lemma into the pumping
lemma. Thus, the pumping lemma is a special case of the Ogden lemma.

As in the case of the pumping lemma, the Ogden lemma formulates necessary conditions
for a language to be context-free. This is why the Ogden lemma in its direct form is
hardly applicable. In practice we use the contraposition of the Ogden lemma.


Lemma 2.4 (contraposition of the Ogden lemma).

If for any constant $n_L$ there exists a word $z \in L$ with at least $n_L$ symbols
marked such that for any split $z = uvwxy$ satisfying the conditions:

• $vx$ includes at least one marked symbol,
• $vwx$ includes no more than $n_L$ marked symbols

there exists $i \in \{0, 1, 2, \ldots\}$ for which $z_i = uv^iwx^iy \notin L$,
then the language $L$ is not context-free.




2.5 Membership of context-free languages 


The central question is how to check whether a word belongs to a language. Having a
context-free grammar, we can answer the question whether a word is generated in this
grammar. Moreover, we will be able to construct a derivation of a given word if it is
generated in the grammar.

First of all, let us assume that a grammar is in Greibach normal form. Note that
any production applied in a derivation adds a single terminal symbol. This means
that any derivation of a word of length $n$ has length $n$ (i.e. productions are
applied $n$ times in a derivation). The method of constructing a derivation is
simple. We start with the initial symbol $S$ of the grammar and apply an
$S$-production whose right-hand side begins with the first letter of the word. In the
next steps, we take the leftmost nonterminal symbol $A$ of the intermediate
derivation word and the next letter $a$ of the word and apply an $A$-production whose
right-hand side begins with the terminal symbol $a$.

If a grammar is simple, i.e., for every nonterminal symbol $A$ and for every terminal
symbol $a$ there is at most one $A$-production with the right-hand side beginning
with $a$, then there is no ambiguity in the choice of productions. Otherwise, when the
grammar is not simple, there is a question how to choose a production. We either
make a nondeterministic choice among all $A$-productions with the right-hand side
starting with the given terminal symbol, or check whether any possible derivation
produces the given word $w$. In the case of checking all possible derivations,
assuming that we have no more than $k$ productions to choose from at each step, we
may have up to $k^n$ derivations, where $n$ is the length of the word $w$. This means
that the computational complexity of this method is exponential, which makes it
impractical. Note: the concept of nondeterminism will be discussed in further parts
of this book.
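
The exhaustive method for a grammar in Greibach normal form can be sketched as a backtracking search over leftmost derivations. The encoding of the grammar (a dictionary mapping a nonterminal to a list of pairs: leading terminal, tuple of following nonterminals), the function name `derives` and the example grammar are assumptions made for this illustration. In the worst case the search is exponential, as discussed above.

```python
def derives(grammar, start, word):
    """Check by exhaustive search whether `word` has a leftmost derivation
    in a grammar given in Greibach normal form."""

    def search(pending, pos):
        # pending: nonterminals still to be expanded, leftmost first
        # pos: number of letters of the word derived so far
        if pos == len(word):
            return not pending
        # every pending nonterminal yields at least one letter, so prune early
        if not pending or len(pending) > len(word) - pos:
            return False
        head, rest = pending[0], pending[1:]
        return any(
            search(tail + rest, pos + 1)
            for terminal, tail in grammar.get(head, [])
            if terminal == word[pos]
        )

    return search((start,), 0)


# Example grammar in Greibach normal form for { a^n b^n : n >= 1 }:
#   S -> a B | a S B,   B -> b
gnf = {"S": [("a", ("B",)), ("a", ("S", "B"))], "B": [("b", ())]}
accepted = derives(gnf, "S", "aabb")
```

The example grammar is not simple in the sense defined above: both $S$-productions begin with the letter `a`, so the search has to try both.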

There are more algorithms for the membership test, many of them applicable to
special forms of context-free grammars. We discuss here an algorithm invented
by J. Cocke, D. Younger and T. Kasami. The algorithm is called the
Cocke-Younger-Kasami algorithm or the CYK algorithm for short. The CYK algorithm
operates on a context-free grammar in Chomsky normal form.

The way of determining whether a word $w$ of length $n$ is generated in a grammar
$G$ in Chomsky normal form is outlined as follows:

1. split the word $w$ into $n$ substrings of length 1 (every letter of the word $w$
   makes up a substring in this case) and find the nonterminal symbols generating
   every substring. This operation is a simple lookup for productions of the form
   $A \to a$, where $a$ is a given substring of length 1,

2. having the nonterminal symbols generating substrings of the word $w$ not longer
   than $k$, we can find the nonterminals generating substrings of length $k + 1$:

   a. split a substring of length $k + 1$ into all possible pairs of substrings (i.e.
      prefixes of lengths $1, 2, 3, \ldots, k$ and the corresponding suffixes of
      lengths $k, k-1, k-2, \ldots, 1$),

   b. for every pair of a prefix and the corresponding suffix, the sets of
      nonterminal symbols generating them have already been determined,

   c. find the set of all nonterminals $A$ such that there is a production
      $A \to BC$, where $B$ and $C$ are nonterminals generating the prefix and the
      corresponding suffix, respectively,

   d. the nonterminals found in point c generate the given string of length $k + 1$
      and no other nonterminal does,

3. finally, we get the set of all nonterminal symbols generating the substring of
   length $n$, i.e., generating the word $w$. The word $w$ is generated in the grammar
   if and only if the initial symbol of the grammar is in this set.


Assuming that $w = a_1 a_2 \ldots a_n$ is an analyzed word and $V_i^j$ is the set of
all nonterminals generating the substring $a_i a_{i+1} \ldots a_{i+j-1}$ of the word
$w$, for $j = 1, 2, \ldots, n$, $i = 1, 2, \ldots, n-j+1$, the Cocke-Younger-Kasami
algorithm can be formulated as follows:

begin
1.  find out all sets V_k^1 of nonterminal symbols
    generating the letter a_k of the word w, k=1,2,...,n
2.  for consecutive length values j=2,3,...,n
    of substrings of the word w do
3.    for consecutive substrings a_k...a_{k+(j-1)} =
      = a_1...a_{1+(j-1)}, a_2...a_{2+(j-1)}, ..., a_{n-j+1}...a_{(n-j+1)+(j-1)} do
      begin
4.      initialize the set V_k^j to the empty set
5.      for splits of a_k...a_{k+(j-1)} to prefix a_k...a_{k+(l-1)}
        and suffix a_{k+l}...a_{k+(j-1)}, l=1,2,...,j-1 do
6.        find out all productions A→BC
          s.t. B∈V_k^l and C∈V_{k+l}^{j-l}
          and include all such A's into the set V_k^j
      end
7.  the word w is generated in the grammar
    if and only if the initial symbol
    of the grammar is included into the set V_1^n
end

Let us analyze the computational complexity of the CYK algorithm. Observe that the
costs of the following operations are upper-bounded by constants:

• initialization of a set $V_k^1$ in operation 1,
• getting a substring in operation 3,
• initialization of a set $V_k^j$ to the empty set in operation 4,
• splitting a string into a prefix and a suffix in operation 5,
• finding out productions in operation 6 (all productions of the grammar are
  checked; the number of productions is fixed, so the cost of this operation is
  bounded by a constant),
• including left-hand sides $A$ of productions into sets.


Finding out productions in operation 6 is the dominant operation of this algorithm.
The number of executions of this operation is equal to:

$$\sum_{j=2}^{n} \sum_{k=1}^{n-j+1} (j-1) = \sum_{j=2}^{n} (n-j+1)(j-1) = -\sum_{j=2}^{n} j^2 + (n+2) \sum_{j=2}^{n} j - (n-1)(n+1) =$$

$$= -\left( \frac{n^3}{3} + \frac{n^2}{2} + \frac{n}{6} - 1 \right) + \frac{(n+2)(n+2)(n-1)}{2} - (n-1)(n+1) = \frac{n^3 - n}{6}$$

which means that the complexity of the CYK algorithm is $O(n^3)$ with regard to the
length of an analyzed word.
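
The closed form $(n^3 - n)/6$ can be cross-checked by simply running the three nested loops of the algorithm and counting how many times operation 6 would be reached. The function name `count_operation_6` is a hypothetical helper introduced for this check.

```python
def count_operation_6(n):
    """Count executions of operation 6 for a word of length n."""
    executions = 0
    for j in range(2, n + 1):            # substring length
        for k in range(1, n - j + 2):    # start position of the substring
            for l in range(1, j):        # split point between prefix and suffix
                executions += 1
    return executions


# Compare the loop count with the closed form for a range of word lengths.
agrees = all(count_operation_6(n) == (n**3 - n) // 6 for n in range(1, 30))
```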

The basic version of the CYK algorithm is used to find out whether a word is
generated in the grammar, but it does not allow for finding a derivation tree. A
modified version of the CYK algorithm, with an extended version of operation 6,
gathers the information necessary for building derivation trees:

6.  find out all productions A→BC s.t. B∈V_k^l and C∈V_{k+l}^{j-l}
    and include all such A's into the set V_k^j,
    store right hand side BC of the production A→BC
    and parameter l in the set A_k^j attached to A
    (left hand side of the production)

The above operation attaches the right-hand side of every $A$-production determined
by operation 6 to the nonterminal symbol $A$. It means that every nonterminal
symbol $A$ in every set $V_k^j$ has an associated set $A_k^j$ of right-hand sides of
the $A$-productions generating the corresponding substring of the word $w$. The
following algorithm generates derivation trees based on the results of the extended
CYK algorithm. The parameters of the function generate (generate a subtree of the
derivation tree) denote: k - the position in the word of the first letter of the
substring, j - the length of the substring, A - the nonterminal generating the
substring, here - the position of the nonterminal A in the tree.
generate(k, j, A, here)
begin
  if j=1 then
  begin
    put the symbol A at the place here,
    draw an edge from A down to the symbol a_k
  end else
  begin
    newTree := false
    currently built tree is the current copy
    for every (BC, l) ∈ A_k^j do
    begin
      if newTree then
        create a copy of the derivation tree
        built before the current call
        of this function, make newly created copy
        to be the current copy,
      apply subsequent operations to the
      current copy
      put the symbol A at the place here
      draw a left edge to a leftVertex
      call generate(k, l, B, leftVertex)
      draw a right edge to a rightVertex
      call generate(k+l, j-l, C, rightVertex)
      newTree := true
    end
  end
end


The first call of the function generate requests building the whole derivation 
tree for an investigated word w of length n and is as follows: 
generate (1,n,S,position-of-the-root). 
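
A compact way to experiment with the extended algorithm is to store the pairs (right-hand side, split length) in a dictionary and to let a recursive generator enumerate the trees. The sketch below departs from the pseudocode above in one respect, which is a deliberate simplification: instead of copying partially drawn trees, it returns every derivation tree as a nested tuple. The names `cyk_with_backpointers`, `generate` and `back` are assumptions made for this illustration.

```python
def cyk_with_backpointers(word, terminal_rules, binary_rules):
    """Extended CYK: back[(k, j, A)] lists the triples (B, C, l) such that
    A -> BC generates the substring of length j at position k with a prefix
    of length l. A key is present exactly when A generates that substring."""
    n = len(word)
    back = {}
    for k in range(1, n + 1):
        for (A, a) in terminal_rules:
            if a == word[k - 1]:
                back.setdefault((k, 1, A), [])
    for j in range(2, n + 1):
        for k in range(1, n - j + 2):
            for l in range(1, j):
                for (A, B, C) in binary_rules:
                    if (k, l, B) in back and (k + l, j - l, C) in back:
                        back.setdefault((k, j, A), []).append((B, C, l))
    return back


def generate(word, back, k, j, A):
    """Yield every derivation tree of the substring (k, j) rooted at A.
    A leaf is the pair (A, letter); an inner vertex is (A, left, right)."""
    if (k, j, A) not in back:
        return
    if j == 1:
        yield (A, word[k - 1])
        return
    for (B, C, l) in back[(k, j, A)]:
        for left in generate(word, back, k, l, B):
            for right in generate(word, back, k + l, j - l, C):
                yield (A, left, right)


# Example grammar: S -> A B | A C,  C -> S B,  A -> a,  B -> b
terminal_rules = {("A", "a"), ("B", "b")}
binary_rules = {("S", "A", "B"), ("S", "A", "C"), ("C", "S", "B")}
word = "aabb"
back = cyk_with_backpointers(word, terminal_rules, binary_rules)
trees = list(generate(word, back, 1, len(word), "S"))
```

For an unambiguous grammar, such as the one in the example, the list `trees` contains exactly one tree for every word of the language and is empty for every other word.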


2.6 Applications 


This section is devoted to selected applications of context-free grammars. The
section is a side note to the main discussion on formal languages, automata and
computability. The topics included in this section are a small part of compiler
practice. They can be used in an elementary project on parsing basic constructions
like arithmetic expressions, which are among the most complex parts of programming
languages. This section does not aim at a complete and detailed presentation of
parsing. It is rather an outline of the topic.


2.6.1 Translation grammars 


This section is focused on some modifications of context-free grammars. Although the
modifications presented here are not part of the main flow of discussion on
context-free grammars, their practical importance makes them valuable and justifies
their presentation in the book. Two extensions of context-free grammars are
presented, namely translation grammars and LL(1) grammars. These types of grammars
will be used to parse arithmetic expressions and translate them to postfix form.

Translation grammars stem from context-free grammars. They can be seen as
context-free grammars with simultaneous derivation of related words.




Definition 2.6. A translation grammar is a context-free grammar $G = (V, T, P, S)$
with the set $T$ of terminal symbols split into two disjoint subsets $T'$ and $T''$ of
primary and secondary symbols, i.e. $T = T' \cup T''$, $T' \cap T'' = \emptyset$.

A translation grammar is a context-free grammar producing a context-free language
$L(G)$ over the alphabet $T$ of terminal symbols. On the other hand, we can say
that the translation grammar produces two languages: the primary language $L'(G)$
and the translation language $L''(G)$. The primary language is obtained from the
language $L(G)$ by removing secondary (translation) symbols from its words.
Similarly, the translation language is obtained from the language $L(G)$ by removing
primary symbols from its words.
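
The relation between a word of $L(G)$ and the corresponding words of the primary and the translation language can be shown with a small Python sketch. The translation grammar used in the comments, the convention of writing secondary symbols in braces and the function name `split_word` are assumptions made for this illustration; they are not taken from these notes.

```python
def split_word(word, primary, secondary):
    """Project a word of L(G) onto the primary and the secondary alphabet."""
    primary_word = [s for s in word if s in primary]
    translation_word = [s for s in word if s in secondary]
    return primary_word, translation_word


# Hypothetical translation grammar turning sums into postfix form:
#   E -> T R,   R -> + T {+} R | epsilon,   T -> a {a}
# Secondary (translation) symbols are written in braces.
primary = {"a", "+"}
secondary = {"{a}", "{+}"}

# One word of L(G), derived as E => T R => a {a} R => a {a} + T {+} R => ...
word = ["a", "{a}", "+", "a", "{a}", "{+}"]
infix, postfix = split_word(word, primary, secondary)
# infix is the primary word a + a, postfix is the translation word {a} {a} {+}
```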


2.6.2 LL(1) grammars 


In this section, we assume that, for a context-free grammar $G = (V, T, P, S)$, any
word $w \in T^*$ has a special end-of-word symbol < appended. This symbol is neither
a nonterminal symbol nor a terminal symbol. It is used for marking the end of any
intermediate and terminal word of a derivation of the word $w$.

Assume that $p : A \to \alpha$ is a production in a context-free grammar
$G = (V, T, P, S)$, where $A \in V$, $\alpha \in (V \cup T)^*$. Let us define the
following sets of symbols:

• $\mathrm{FIRST}(p) = \mathrm{FIRST}(\alpha)$ is the set of those terminal symbols
  which may open an intermediate word derivable from $\alpha$,

• $\mathrm{FOLLOW}(p) = \mathrm{FOLLOW}(A)$ is the set of those terminal symbols and
  the end-of-word symbol which may directly follow $A$ in an intermediate word
  derived from the initial symbol $S$ of the grammar $G$,

• $$\mathrm{SELECT}(p) = \begin{cases} \mathrm{FIRST}(p) \cup \mathrm{FOLLOW}(p) & \text{if } p \text{ is nullable} \\ \mathrm{FIRST}(p) & \text{otherwise} \end{cases}$$


Definition 2.7. A context-free grammar $G = (V, T, P, S)$ is an LL(1) grammar if and
only if for every nonterminal symbol $A \in V$ all $A$-productions have pairwise
disjoint SELECT sets. This condition is called the LL(1) uniqueness condition.

LL(1) grammars are tools for building a top-down membership analyzer (top-down
parser). They allow for constructing a Leftmost derivation of an input word while
processing the word from Left to right. The derivation is constructed based on 1
input symbol at a time.

Note that the uniqueness condition of LL(1) grammars is similar to the uniqueness
condition of simple grammars in Greibach normal form. This allows for an easy
construction of the derivation of a word: for a given leftmost nonterminal symbol
$A \in V$ in an intermediate derivation word and a given input symbol $a \in T$, we
apply to the nonterminal $A$ the $A$-production which has the terminal $a$ in its
SELECT set. When the right-hand side of the applied production begins with the
terminal symbol $a$, the input is shifted to the next input symbol. Translation
symbols in intermediate derivation words are skipped during this processing.
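
The sets FIRST, FOLLOW and SELECT and the LL(1) uniqueness condition can be computed by a simple fixed-point iteration. The Python sketch below is one way to do it, under assumptions introduced only for this illustration: productions are pairs `(lhs, rhs)` with `rhs` a tuple of symbols (the empty tuple stands for an ε-production), the character `$` stands in for the end-of-word symbol, every nonterminal occurs on some left-hand side, translation symbols are left out, and the function names are hypothetical.

```python
END = "$"  # stand-in for the end-of-word symbol; the character itself is arbitrary


def first_of(seq, first, nullable, nonterminals):
    """FIRST set of a string of symbols and a flag saying whether it is nullable."""
    out = set()
    for s in seq:
        if s not in nonterminals:
            out.add(s)
            return out, False
        out |= first[s]
        if s not in nullable:
            return out, False
    return out, True


def select_sets(productions, start):
    """Return a list of triples (lhs, rhs, SELECT set of the production)."""
    nonterminals = {lhs for lhs, _ in productions}
    nullable = set()
    first = {A: set() for A in nonterminals}
    follow = {A: set() for A in nonterminals}
    follow[start].add(END)
    changed = True
    while changed:  # repeat until no set grows any more
        changed = False
        for lhs, rhs in productions:
            f, null = first_of(rhs, first, nullable, nonterminals)
            if not f <= first[lhs]:
                first[lhs] |= f
                changed = True
            if null and lhs not in nullable:
                nullable.add(lhs)
                changed = True
            for i, s in enumerate(rhs):
                if s in nonterminals:
                    f, null = first_of(rhs[i + 1:], first, nullable, nonterminals)
                    new = f | (follow[lhs] if null else set())
                    if not new <= follow[s]:
                        follow[s] |= new
                        changed = True
    result = []
    for lhs, rhs in productions:
        f, null = first_of(rhs, first, nullable, nonterminals)
        result.append((lhs, rhs, (f | follow[lhs]) if null else f))
    return result


def is_ll1(productions, start):
    """Check that SELECT sets of productions with the same left-hand side are disjoint."""
    seen = {}
    for lhs, _, sel in select_sets(productions, start):
        if seen.setdefault(lhs, set()) & sel:
            return False
        seen[lhs] |= sel
    return True


# Example: E -> T R,  R -> + T R | epsilon,  T -> a | ( E )
grammar = [("E", ("T", "R")), ("R", ("+", "T", "R")), ("R", ()),
           ("T", ("a",)), ("T", ("(", "E", ")"))]
ll1 = is_ll1(grammar, "E")
```

In the example the ε-production of $R$ is nullable, so its SELECT set equals FOLLOW($R$), which consists of the closing parenthesis and the end-of-word symbol; it is disjoint from the SELECT set of the other $R$-production, which contains only `+`.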


Chapter 3 
Context-sensitive and unrestricted grammars 


The class of context-sensitive languages follows the class of context-free languages.
In the hierarchy of languages, the so-called Chomsky hierarchy, the next classes of
languages, besides regular languages and context-free languages, are context-sensitive
and recursively enumerable languages. Context-sensitive languages are generated
by context-sensitive grammars, which happen to be a generalization of context-free
grammars. Recursively enumerable languages are generated by unrestricted grammars,
which are an extension of context-sensitive grammars. We will also distinguish
the class of recursive languages. However, there is no class of grammars generating
recursive languages. The class of recursive languages is separated from the
class of recursively enumerable languages on the basis of a special class of
automata. This topic will be presented in Chapter 4.

Context-sensitive and unrestricted grammars are much more complex than context-free
and regular grammars. Likewise, the structure of words of these classes of languages
is much more complex than the structure of words of context-free languages. There
are no properties showing restriction of the structure of words, or finiteness or
regularity of the language structure. An algebraic characterization of these
languages in the set of all words is not known. Therefore, we do not have effective
tools for processing these languages, as we do in the case of context-free languages;
e.g., a tool such as the pumping lemma is not known for context-sensitive languages.

This chapter gives a short presentation of the basic properties of context-sensitive
and recursively enumerable languages.


3.1 Context-sensitive grammars 


Productions of context-sensitive grammars satisfy the monotonic condition (also
called the noncontracting condition). For this reason, context-sensitive grammars
are also called monotonic grammars or noncontracting grammars.




Definition 3.1. A grammar $G = (V, T, P, S)$ is context-sensitive if and only if its
productions are monotonic (or noncontracting), i.e., they are of the form:

$$\alpha \to \beta, \quad \text{where } \alpha, \beta \in (V \cup T)^* \text{ and } 0 < |\alpha| \le |\beta|,$$

where $|w|$ denotes the length of the word $w$.


Definition 3.2. Context-sensitive languages are those generated by context-sensitive 
grammars, and only those. 


The class of context-sensitive grammars and the class of context-sensitive languages
are denoted by CSG and CSL, respectively.

The monotonic condition excludes the empty word ε from context-sensitive languages.
In terms of the above definition, any language which includes the empty word is not
context-sensitive. This is the strict meaning of the CSL and CSG classes.

But it would be unreasonable to exclude from the CSL class a context-sensitive
language with the empty word attached. Thus, any language $L$ such that
$L - \{\varepsilon\}$ is a context-sensitive language will be included in the CSL
class. Note that, having a language $L$ generated by a monotonic grammar, we can add
the empty word ε to $L$ by attaching the production $S \to \varepsilon$ to the
grammar, where $S$ is the initial symbol of the grammar. This production breaks
monotonicity of the grammar, so it will be considered to be the unique exception in
a context-sensitive grammar. This meaning of context-sensitivity is called extended
context-sensitivity.

Summarizing the above notes, the CSL and CSG classes will be considered in the strict
or extended sense depending on the context of discussion. In the sequel, we will not
distinguish between the strict and the extended sense of context-sensitivity if this
does not lead to confusion.


Definition 3.3. A grammar $G = (V, T, P, S)$ is in context-sensitive normal form if
and only if its productions have the following form:

$$\gamma A \delta \to \gamma \alpha \delta, \quad \text{where } A \in V,\ \alpha, \gamma, \delta \in (V \cup T)^*,\ \alpha \neq \varepsilon.$$

$\gamma$ and $\delta$ are called the left and right context, respectively, and
$A \to \alpha$ is called the core of the production.


Note that a grammar in context-sensitive normal form is a context-sensitive
grammar, since its productions are monotonic. Hence the class of grammars in
context-sensitive normal form is included in the CSG class. The question is whether
the CSG class is equivalent to the class of grammars in context-sensitive normal
form. This question is equivalent to the question whether for any context-sensitive
grammar we can find an equivalent grammar in context-sensitive normal form. The
answer is affirmative. Hence, grammars in context-sensitive normal form do not create
a new class of grammars. Therefore, we will simply refer to context-sensitive
grammars in normal form.

Note that the core is simply a context-free production. Also observe that only
the core of a production can affect a derivation. However, the core of a
production can be used only if the left and right contexts are preserved. This
is why grammars with monotonic productions are called context-sensitive
grammars.

Now, let us justify that every context-sensitive grammar can be transformed to 
normal form: 


Lemma 3.1. Any context-sensitive grammar $G = (V, T, P, S)$ can be transformed
to an equivalent grammar in normal form (both grammars generate the same
language).


Proof. We construct a context-sensitive grammar in normal form which is equiva- 
lent to the grammar G. A detailed proof is left to the reader. 

The grammar in normal form equivalent to a given context-sensitive grammar is 
constructed as follows: 


1. for every terminal symbol $a \in T$


a. create a new nonterminal symbol $A_a$ and convert every production of the
grammar $G$ by replacing every occurrence of the symbol $a$ with the new
nonterminal $A_a$,

b. add the new production $A_a \to a$.


We get a new grammar $G_T = (V \cup V_T, T, P' \cup P_T, S)$ with an extended
set $V \cup V_T$ of nonterminal symbols and an extended set $P' \cup P_T$ of
productions, where $V_T = \{A_a : a \in T\}$, $P'$ is the set of productions
converted from $P$, and $P_T = \{A_a \to a : a \in T\}$. Productions of $P'$
now have the form $A_1 A_2 \ldots A_k \to B_1 B_2 \ldots B_l$, where $k \le l$
(since the grammar $G$ is context-sensitive, i.e. it is a monotonic grammar)
and all symbols in the production are nonterminal symbols, i.e.
$A_1, \ldots, A_k, B_1, \ldots, B_l \in V \cup V_T$,

2. let us split the set $P'$ of productions into the subset $P'_n$
(corresponding to the subset $P_n$ of $P$) of productions in normal form and
the subset $P'_m$ (corresponding to the subset $P_m$ of $P$) of productions
that are monotonic, but not in normal form, so that $P = P_n \cup P_m$ and
$P' = P'_n \cup P'_m$. Let us enumerate the productions of the set $P'_m$,
which are not in normal form,

3. for every production $A_1 A_2 \ldots A_k \to B_1 B_2 \ldots B_l$ of the set
$P'_m$ with assigned number $r$ do:


a. create a new nonterminal symbol $A^r$,
b. replace the production number $r$ with the following set $R_r$ of $k+1$
productions in normal form:
◦ $A_1 \ldots A_{k-1} A_k \to A_1 \ldots A_{k-1} A^r$, where
$\gamma = A_1 \ldots A_{k-1}$ is the left context, the right context is empty
and $A_k \to A^r$ is the core,
◦ $A_1 A_2 \ldots A_{k-1} A^r \to B_1 A_2 \ldots A_{k-1} A^r$, where the left
context is empty, $\delta = A_2 A_3 \ldots A_{k-1} A^r$ is the right context
and $A_1 \to B_1$ is the core,
◦ $B_1 A_2 A_3 \ldots A_{k-1} A^r \to B_1 B_2 A_3 \ldots A_{k-1} A^r$, where
$\gamma = B_1$ is the left context, $\delta = A_3 \ldots A_{k-1} A^r$ is the
right context and $A_2 \to B_2$ is the core,
◦ ...
◦ $B_1 \ldots B_{k-2} A_{k-1} A^r \to B_1 \ldots B_{k-2} B_{k-1} A^r$, where
$\gamma = B_1 \ldots B_{k-2}$ is the left context, $\delta = A^r$ is the right
context and $A_{k-1} \to B_{k-1}$ is the core,
◦ $B_1 \ldots B_{k-1} A^r \to B_1 \ldots B_{k-1} B_k \ldots B_l$, where
$\gamma = B_1 \ldots B_{k-1}$ is the left context, the right context is empty
and $A^r \to B_k \ldots B_l$ is the core,


4. finally we come up with the following context-sensitive grammar in normal
form $G_N = (V_N, T, P_N, S)$, where:


◦ $V_N = V \cup V_T \cup \bigcup_r \{A^r\}$, where $r$ ranges over the numbers
of the productions $A_1 A_2 \ldots A_k \to B_1 B_2 \ldots B_l$ from $P'_m$,

◦ $P_N = P_T \cup P'_n \cup \bigcup_r R_r$, where $R_r$ is the set of
productions in normal form corresponding to the production number $r$ of
$P'_m$.
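
Step 3 of the construction can be illustrated with a short Python sketch. It is
only an illustration under stated assumptions: productions are pairs of tuples
of nonterminal names, and the function names `split_production` and
`is_normal_form` as well as the parameter `fresh` (standing for the new
nonterminal $A^r$) are hypothetical. The sketch checks only that every
production it produces has the shape required by Definition 3.3; it does not
prove equivalence of the grammars.

```python
def split_production(lhs, rhs, fresh):
    """Split A_1..A_k -> B_1..B_l (1 <= k <= l) into k+1 productions,
    each rewriting one nonterminal in a fixed left and right context.
    `fresh` plays the role of the new nonterminal A^r (hypothetical name)."""
    lhs, rhs = tuple(lhs), tuple(rhs)
    k = len(lhs)
    if not 1 <= k <= len(rhs):
        raise ValueError("the production must be monotonic and non-empty")
    # A_1..A_{k-1} A_k -> A_1..A_{k-1} A^r   (core: A_k -> A^r)
    result = [(lhs, lhs[:-1] + (fresh,))]
    # B_1..B_{i-1} A_i A_{i+1}..A_{k-1} A^r -> B_1..B_i A_{i+1}..A_{k-1} A^r
    for i in range(k - 1):
        gamma = rhs[:i]
        delta = lhs[i + 1:k - 1] + (fresh,)
        result.append((gamma + (lhs[i],) + delta, gamma + (rhs[i],) + delta))
    # B_1..B_{k-1} A^r -> B_1..B_{k-1} B_k..B_l   (core: A^r -> B_k..B_l)
    gamma = rhs[:k - 1]
    result.append((gamma + (fresh,), gamma + rhs[k - 1:]))
    return result


def is_normal_form(lhs, rhs):
    """True if lhs = gamma A delta and rhs = gamma alpha delta, alpha non-empty.
    Assumes every symbol is a nonterminal, as after step 1."""
    lhs, rhs = tuple(lhs), tuple(rhs)
    if not 1 <= len(lhs) <= len(rhs):
        return False
    for i in range(len(lhs)):
        gamma, delta = lhs[:i], lhs[i + 1:]
        if rhs[:i] == gamma and rhs[len(rhs) - len(delta):] == delta:
            return True
    return False


# Example: the monotonic production AB -> BA is not in normal form.
parts = split_production(("A", "B"), ("B", "A"), fresh="Z")
for left, right in parts:
    print(" ".join(left), "->", " ".join(right))
```

For the production $AB \to BA$ the sketch yields three productions
($k + 1$ with $k = 2$), each of which changes a single symbol while its
neighbours stay in place.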


3.2 Unrestricted grammars 


The class of unrestricted grammars is the most general class of grammars.
Unrestricted grammars are similar to context-sensitive grammars except that
productions are not required to be monotonic (noncontracting). Unrestricted
grammars generate the class of recursively enumerable languages, which will be
denoted as the REL class of languages. The formal definitions are as follows:


Definition 3.4. A grammar $G = (V, T, P, S)$ is unrestricted if and only if its
productions are of the form:


$\alpha \to \beta$, where $\alpha, \beta \in (V \cup T)^*$ and $|\alpha| > 0$,

where $|w|$ denotes the length of the word $w$.


Definition 3.5. Recursively enumerable languages are those generated by
unrestricted grammars, and only those.
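
The difference between the two classes of productions comes down to a length
condition, which the following Python sketch makes explicit. It is a minimal
illustration: each symbol is assumed to be a single character, and the function
names and sample productions are hypothetical.

```python
def is_unrestricted(lhs, rhs):
    """Definition 3.4: the left-hand side must be non-empty."""
    return len(lhs) > 0


def is_monotonic(lhs, rhs):
    """Monotonic (noncontracting): the right-hand side is not shorter."""
    return 0 < len(lhs) <= len(rhs)


# Sample productions, one character per symbol (illustrative only).
productions = [("S", "aSBc"), ("cB", "Bc"), ("AB", "A")]
for lhs, rhs in productions:
    kind = "monotonic" if is_monotonic(lhs, rhs) else "unrestricted only"
    print(f"{lhs} -> {rhs}: {kind}")
```

Every monotonic production passes the unrestricted test, while a contracting
production such as the last one passes only the unrestricted test.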


Context-sensitive grammars are special cases of unrestricted grammars. Thus,
the class of context-sensitive languages is a subclass of recursively
enumerable languages, i.e. $CSL \subseteq REL$. But it is not obvious whether
this inclusion is proper, i.e. whether $CSL \neq REL$, because nearly every
language that we can imagine is context-sensitive. In subsequent chapters we
will construct languages that are recursively enumerable, but not
context-sensitive.


Chapter 4 
Turing machines 


In previous chapters methods of generating languages were studied. Those
methods are based on different types of grammars and on regular expressions.
This and the next chapters are devoted to methods of accepting languages.
Acceptance of languages is based on different types of automata: Turing
machines, linear automata, push-down automata and finite automata. Identifying
grammars with generation of languages and automata with acceptance of languages
is a subjective and intuitive categorization made by the authors. However, it
reflects the nature of the tools for processing languages.

Turing machines (and other types of automata) can be interpreted as models of
computation. The Turing machine is a universal model of computation that is
used for such purposes as, for instance, accepting languages, computing
functions and solving problems.

In this chapter Turing machines, and automata in general, will be employed as
tools for accepting languages, i.e. they will be queried whether a given word
is in the language accepted by a given automaton or not.

Turing machines can also compute functions. Such machines compute functions 
with natural numbers as domain and co-domain. Another type of Turing machines 
solves problems like, for instance, the sorting problem, the shortest paths problem, 
etc. 


4.1 Deterministic Turing machines 


In this book two categories of Turing machines (automata, in general) will be stu- 
died: deterministic and nondeterministic. Roughly speaking, computation of an au- 
tomaton is a sequence of configurations organized according to some control in- 
formation. An automaton is a deterministic one if and only if there is at most one 
possibility of doing a transition in any configuration. If, for given automaton, there 
is a choice of doing a transition in some configuration, then such an automaton is a 
nondeterministic one. 




In this section different categories of deterministic Turing machines are
studied: a basic model, a model with guard, a multi-track model and a
multi-tape model. At the end of this chapter nondeterministic Turing machines
are discussed. Equivalence of these categories of Turing machines is
established, i.e. it is shown that for a Turing machine in any model we can
find an equivalent machine in any other model. Equivalence of Turing machines
(equivalence of automata, in general) means that they accept the same language,
compute the same function or solve the same problem. The discussion leads to
the main goal of this chapter: equivalence of deterministic and
nondeterministic Turing machines.


4.1.1 Basic model of Turing machines 


A definition of the basic model of a deterministic Turing machine is given
below. Later in this chapter other deterministic models of Turing machines are
discussed. They are proved to be equivalent to the basic model. As stated
above, equivalence with regard to accepted languages is considered. However,
generalization of equivalence to Turing machines computing functions or solving
problems is straightforward.


Definition 4.1. A Turing machine in basic model is a system

$M = (Q, \Sigma, \Gamma, \delta, q_0, B, F, C)$


with components as follows:

$Q$ - a finite set of states,

$\Gamma$ - a finite set of tape symbols (tape alphabet),

$B$ - the blank symbol (of the tape alphabet), $B \in \Gamma$,

$\Sigma$ - an input alphabet, $\Sigma \subseteq \Gamma - \{B\}$,

$q_0$ - the initial state, $q_0 \in Q$,

$F$ - a set of accepting states, $F \subseteq Q$,

$C$ - a stop condition; its satisfaction is necessary and sufficient to stop
computation,

$\delta$ - a transition function, which is a mapping:

$\delta : Q \times \Gamma \to Q \times \Gamma \times \{L, R\}$

where $L$, $R$ denote left and right directions.




A Turing machine can be interpreted as a physical mechanism, shown in
Figure 4.1. This mechanism consists of:


• a control unit, which is in a state of $Q$,

• a one-way infinite tape split into cells; every cell contains exactly one
symbol of the tape alphabet $\Gamma$,

• a head, which is placed over a cell of the tape; it reads the symbol held in
the cell, stores a desired symbol in the cell and shifts left or right.


[Figure: a one-way infinite tape holding a_1 a_2 a_3 ... a_n B B ..., the head
over the leftmost cell, and the control unit in state q_0.]

Fig. 4.1 Basic model of Turing machine.


Turing machines do computation for given input data. A computation of a given
Turing machine is done according to the following intuitive procedure:


1. the initial configuration of a given Turing machine is described as follows:


a. input data, a word $w = a_1 a_2 \ldots a_n$ over the input alphabet
$\Sigma$, is stored in the first $n$ cells of the tape, cf. Figure 4.1,

b. all other cells of the tape, which is infinite to the right, are filled
with the blank symbol $B$,

c. the head of the Turing machine is placed over the first (leftmost) cell of
the tape,

d. the control unit is in the initial state $q_0$,


2. if the stop condition $C$ is satisfied, then computation is halted. The
configuration is called the final configuration. The machine reports whether
its control unit is in an accepting state or not,

3. if the stop condition $C$ is not satisfied, then, based on the state $q$ of
the control unit and on the symbol $X$ read by the head, the Turing machine
performs the following actions:


a. the value $(p, Y, D)$ of the transition function $\delta(q, X)$ is computed,
b. the head stores the tape symbol $Y$ in the cell under it,

c. the control unit switches to the state $p$,
d. the head shifts one cell in the direction $D$,


4. computation goes back to point 2.
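
The procedure above can be sketched as a small Python simulator. This is an
illustration under explicit assumptions, not part of the definition: the stop
condition $C$ is taken to hold when no transition is defined for the current
state and symbol, tape symbols are single characters, and the function name
`run`, the step limit `max_steps` and the sample machine (which accepts words
over {0, 1} with an even number of 1s) are hypothetical.

```python
def run(delta, q0, accepting, word, blank="B", max_steps=10_000):
    """Simulate the basic model on a one-way infinite tape.

    Assumption: the stop condition C holds when no transition is defined
    for the current (state, symbol) pair.
    """
    tape = list(word) or [blank]          # step 1: input in the first cells
    state, head = q0, 0                   # head over the leftmost cell
    for _ in range(max_steps):
        symbol = tape[head]
        if (state, symbol) not in delta:  # step 2: stop condition satisfied
            return state in accepting, "".join(tape).rstrip(blank)
        p, y, d = delta[(state, symbol)]  # step 3a: value of delta(q, X)
        tape[head] = y                    # step 3b: store Y
        state = p                         # step 3c: switch to state p
        head += 1 if d == "R" else -1     # step 3d: shift in direction D
        if head < 0:
            raise RuntimeError("the head moved off the left end of the tape")
        if head == len(tape):
            tape.append(blank)            # the tape is infinite to the right
    raise RuntimeError("step limit reached; the machine may not halt")


# Sample machine: accepts words over {0, 1} with an even number of 1s.
delta = {
    ("even", "0"): ("even", "0", "R"),
    ("even", "1"): ("odd", "1", "R"),
    ("odd", "0"): ("odd", "0", "R"),
    ("odd", "1"): ("even", "1", "R"),
    ("even", "B"): ("acc", "B", "R"),
}
accepted, tape = run(delta, "even", {"acc"}, "1001")
```

The step limit is a practical guard only: a simulator cannot decide in general
whether a computation is infinite, so it gives up after a fixed number of
steps instead of looping forever.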


The above intuitive procedure can be adapted to Turing machines which compute
functions or solve problems. This adaptation needs a redefinition of input
data. Input data of a Turing machine is:


• a sequence of arguments of the function computed by the machine. Arguments
of a function are encoded as numbers, e.g. in the binary or decimal positional
system or in the unary system. Of course, some sort of separator between
arguments must be used,

• data defining an instance of the problem solved by the machine. This data is
encoded in a way depending on the type of data.


Note that the stop condition $C$ may never be satisfied, in which case the
machine performs an infinite computation. When the computation is infinite, the
input data is not accepted by the machine, i.e. the input word does not belong
to the language accepted by the Turing machine. It should be emphasized that a
Turing machine always stops its computation in an accepting state when its
input is a word of the language accepted by the machine. Turing machines raise
a difficult problem: while a machine is performing a long computation, there is
no way of knowing whether it has fallen into an infinite computation or will
stop at some point in the future.


Remark 4.1. Turing machines will be designed assuming that, if a machine
terminates computation, its head is placed over the first (leftmost) cell and:


• when the machine accepts a language, then all cells of the tape are filled
with the blank symbol $B$,

• when the machine computes a function, then the value of the function is
stored in the beginning cells of the tape. All other cells are filled with the
blank symbol $B$. The stored value is the correct result if and only if the
machine stopped computation in an accepting state,

• when the machine solves a problem, then output data is stored in the
beginning cells of the tape. All other cells of the tape are filled with the
blank symbol $B$. Output data is a correct solution of a computed instance of
the problem if and only if the machine stops computation in an accepting state.


Assumptions of the above remark are not included in Definition 4.1, but they
are desirable for clarity of computation. An epilogue of a computation
guaranteeing satisfaction of the above assumptions is called a cleaning
procedure.


Definition 4.2. A step description (a configuration) of a Turing machine

$M = (Q, \Sigma, \Gamma, \delta, q_0, B, F, C)$


is a sequence of symbols:

$\alpha_1 \, q \, \alpha_2$


where:


• $q \in Q$ is the current state of the control unit of the Turing machine,

• $\alpha_1$ is the sequence of symbols stored in cells beginning from the
leftmost one and ending with the cell preceding the one under the head,

• $\alpha_2$ is the sequence of symbols stored in cells beginning from the one
under the head, going to the right and ending with the rightmost one holding a
non-blank symbol.


Note that both sequences $\alpha_1$ and $\alpha_2$ are words over the tape
alphabet $\Gamma$ and that either of them may be the empty word. However,
neither of these two sequences can be infinite, for the following reasons:


• input data is finite, so only a finite number of cells are filled with
non-blank symbols in the initial configuration,

• after any step of computation only a finite number of cells could have been
visited by the head.


For instance, the initial step description (configuration) is of the form
$q_0 w$, where $q_0$ is the initial state and $w$ is the input data. In this
case $\alpha_1$ is the empty word. On the other hand, a step description
$\alpha_1 q$ indicates that all cells from the one under the head to the right
are filled with the blank symbol $B$; here $\alpha_2$ is the empty word.
Finally, when a Turing machine accepting a language ends its computation in a
state $q$, then, according to Remark 4.1, $q$ will be the final step
description.

Let us analyze transitions done by Turing machines. We assume that a step
description is characterized by the following sequence of symbols:


$X_1 X_2 \ldots X_{i-1} \, q \, X_i \ldots X_n$


Recall that:


• $q$ is the current state of the control unit,

• $X_1 X_2 \ldots X_{i-1}$ is the sequence of symbols of the tape alphabet
$\Gamma$ stored in the cells preceding the cell under the head,

• $X_i X_{i+1} \ldots X_n$ is the sequence of symbols stored in cells beginning
from the one under the head, going to the right and ending with the rightmost
one containing a non-blank symbol,

• the head is placed over the cell storing the symbol $X_i$. If the sequence
$X_i X_{i+1} \ldots X_n$ is the empty word, then the head reads the blank
symbol $B$.


The next step description is determined by the transition function:


• if the value of the transition function is $\delta(q, X_i) = (p, Y, R)$, i.e.
the control unit switches to the state $p$, the head stores $Y$ and shifts
right, then we get the following configuration:

$X_1 X_2 \ldots X_{i-1} Y \, p \, X_{i+1} \ldots X_n$


• if the value of the transition function is $\delta(q, X_i) = (p, Y, L)$, then
we get the following configuration:


$X_1 X_2 \ldots X_{i-2} \, p \, X_{i-1} Y X_{i+1} \ldots X_n$


We will use the symbol $>$ to denote a transition of a Turing machine. The
transition symbol $>$ may be supplemented with a Turing machine name, $>_M$, to
emphasize that a transition concerns a given Turing machine. It can also be
supplemented with a superscript, $>^k$, to denote that $k$ transitions are
done.

The above two transitions done by a Turing machine will be denoted as follows:


$X_1 X_2 \ldots X_{i-1} \, q \, X_i \ldots X_n > X_1 X_2 \ldots X_{i-1} Y \, p \, X_{i+1} \ldots X_n$


$X_1 X_2 \ldots X_{i-1} \, q \, X_i \ldots X_n > X_1 X_2 \ldots X_{i-2} \, p \, X_{i-1} Y X_{i+1} \ldots X_n$
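
The two cases above translate directly into a function on step descriptions.
The Python sketch below is illustrative only: a step description is assumed to
be a triple of a tuple of symbols, a state and a tuple of symbols, and the names
`next_config` and `strip_blanks` are hypothetical. Trailing blanks are stripped
after a left shift so that the right-hand sequence still ends at the rightmost
non-blank symbol.

```python
def strip_blanks(symbols, blank):
    """Drop trailing blank symbols from a tuple of tape symbols."""
    end = len(symbols)
    while end > 0 and symbols[end - 1] == blank:
        end -= 1
    return symbols[:end]


def next_config(alpha1, q, alpha2, delta, blank="B"):
    """Return the step description that follows (alpha1, q, alpha2)."""
    x = alpha2[0] if alpha2 else blank   # empty alpha2: the head reads B
    p, y, d = delta[(q, x)]
    rest = alpha2[1:]
    if d == "R":
        # X_1..X_{i-1} q X_i..X_n  >  X_1..X_{i-1} Y p X_{i+1}..X_n
        return alpha1 + (y,), p, rest
    if not alpha1:
        raise ValueError("cannot shift left from the leftmost cell")
    # X_1..X_{i-1} q X_i..X_n  >  X_1..X_{i-2} p X_{i-1} Y X_{i+1}..X_n
    return alpha1[:-1], p, strip_blanks((alpha1[-1], y) + rest, blank)


# Sample transition function (illustrative only).
delta = {
    ("q", "a"): ("p", "b", "R"),
    ("p", "a"): ("q", "c", "L"),
    ("q", "B"): ("p", "B", "L"),
}
c1 = next_config(("x",), "q", ("a", "a"), delta)
c2 = next_config(c1[0], c1[1], c1[2], delta)
```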


Definition 4.3. Transitions of a Turing machine create a binary relation in the
space of all possible configurations of the machine, i.e. two configurations
are related if and only if the second one is derived from the first one by
applying the transition function. This relation is called the transition
relation of a given Turing machine. The transitive closure of the transition
relation is denoted $>^*$.


Definition 4.4. A computation of a Turing machine
$M = (Q, \Sigma, \Gamma, \delta, q_0, B, F, C)$ is a sequence of configurations
$\eta_1, \eta_2, \ldots, \eta_n$ such that $\eta_1$ is the initial
configuration, $\eta_n$ is the final configuration and every pair of successive
configurations belongs to the transition relation. If the machine falls into an
infinite computation, then its computation is an infinite sequence of
configurations $\eta_1, \eta_2, \eta_3, \ldots$ such that $\eta_1$ is the
initial configuration and every pair of successive configurations belongs to
the transition relation. A finite computation is denoted as
$\eta_1 > \eta_2 > \ldots > \eta_n$ and an infinite computation is denoted as
$\eta_1 > \eta_2 > \eta_3 > \ldots$


Now we give a formal definition of acceptance of an input by a Turing machine:


Definition 4.5. A Turing machine accepts its input if and only if the
computation terminates in an accepting state. In other words, a Turing machine
accepts its input if and only if the pair of the initial configuration and a
final configuration with an accepting state belongs to the transitive closure
of the transition relation, i.e. $\eta_1 >^* \eta_F$, where $\eta_1$ is the
initial configuration and $\eta_F$ is a final configuration.


Based on the above discussion we give now formal definitions of some concepts. 


Definition 4.6. The language $L(M)$ accepted by a Turing machine $M$
is the set of words $w \in \Sigma^*$ accepted by the Turing machine.


Definition 4.7. A function computed by a Turing machine $M$ is a mapping from a
space of input data into a space of output data. If the machine accepts its
input, then its output is the correct value of the function. Otherwise, when
the machine stops computation but does not accept the input, or when it
performs an infinite computation, the function is undefined for such input
data.


Remark 4.2. We assume that blank symbols $B$ will neither separate non-blank
symbols on the tape, nor be placed in leftmost cells prior to a non-blank
symbol. This assumption is not required by the definitions and concepts
discussed so far. However, it simplifies the design of Turing machines for
given tasks.


4.1.2 Turing machine with the stop property 


As noted before, some Turing machines can fall into an infinite computation.
Others terminate their computation for any input data. This observation
motivates the definition of the subclass of Turing machines which always
terminate their computation.




Definition 4.8. A Turing machine in basic model with the stop property is a
system introduced in Definition 4.1


$M = (Q, \Sigma, \Gamma, \delta, q_0, B, F, C)$


such that it terminates its computation for any input data.


It is worth emphasizing that a function computed, or a language accepted, by a
Turing machine with the stop property can also be computed or accepted by a
Turing machine without the stop property. Moreover, such a Turing machine
without the stop property may perform an infinite computation for some input
data.

If two Turing machines, one with the stop property and one without it, compute
the same function or accept the same language, then they must terminate
computation in accepting states for the same input data. Both machines may
also terminate computation in rejecting states for the same data. They may
behave differently only for input data that is not accepted. In that case, the
machine with the stop property terminates its computation in a rejecting state
while the second machine falls into an infinite computation.


Definition 4.9. Turing machines are considered to be equivalent if and only if
they terminate computation in an accepting state for the same input data.


4.1.3 Simplifying the stop condition 


Now we will slightly change definitions of Turing machines in basic model in order 
to simplify definition of termination of its computation. 


Definition 4.10. A Turing machine with the halting accepting state is a system:

$M = (Q, \Sigma, \Gamma, \delta, q_0, B, F)$


where:


• $F = \{q_A\}$ - there is only one accepting state $q_A$,

• the stop condition is satisfied if and only if the machine switches to the
accepting state $q_A$,

• other components of the system are as described in Definition 4.1.


Proposition 4.1. Turing machines with the halting accepting state are equivalent to 
Turing machines in basic model. 


Proof. First of all, a Turing machine
$M = (Q, \Sigma, \Gamma, \delta, q_0, B, \{q_A\})$ with the halting accepting
state formally matches Definition 4.1 when the stop condition is included in
the system, i.e. $M = (Q, \Sigma, \Gamma, \delta, q_0, B, \{q_A\}, C)$, where
$C$ is satisfied if and only if the machine switches to the state $q_A$.

On the other hand, a Turing machine in basic model
$M = (Q, \Sigma, \Gamma, \delta, q_0, B, F, C)$ can be converted to a machine
with the halting accepting state by:


• adding new states $q_\#$ and $q_A$,

• doing two transitions, $(q_\#, X, R)$ and $(q_A, Y, L)$, when the stop
condition of the Turing machine in basic model is satisfied and the machine is
in an accepting state, where $X$ and $Y$ are the symbols previously stored in
the cells under the head. These two transitions just switch the machine to the
new state $q_A$, keeping the contents of the tape unaffected and placing the
head in the same position as before these transitions,

• replacing the former accepting states by $q_A$,

• redefining the stop condition: computation is halted if and only if the
machine switches to the state $q_A$ (now the only accepting state).


The machine with the halting accepting state may fall into an infinite
computation when the machine in basic model stops its computation in a
non-accepting state. Nevertheless, the machine with the halting accepting state
terminates its computation in an accepting state if and only if the machine in
basic model does the same.

This proves equivalence of both machines with regard to the accepted language.
Therefore, Turing machines in basic model are equivalent to Turing machines
with the halting accepting state.


Definition 4.11. A Turing machine with halting states is a system

$M = (Q, \Sigma, \Gamma, \delta, q_0, B, F, R)$


such that it terminates its computation for any input data, where:


• $F = \{q_A\}$ - includes only one accepting state $q_A$,

• $R = \{q_R\}$ - includes a special non-accepting state $q_R$,

• computation for any input always reaches one of the states $q_A$ or $q_R$,

• computation stops if and only if it reaches $q_A$ or $q_R$,

• other components of the system are as described in Definition 4.1.


Proposition 4.2. Turing machines with halting states are equivalent to Turing ma- 
chines in basic model with the stop property. 


Proof. Just modify the proof of Proposition 4.1 to justify this Proposition. 


4.1.4 Guarding the tape beginning 


The basic model of Turing machines raises the practical problem of how to
detect the beginning of the tape. Of course, a general solution to this problem
is to store special symbols in the first cell of the tape, which replace the
symbols defined by the transition function. However, such a solution enlarges
the set of states, the tape alphabet and the transition function. The model of
a Turing machine with guard avoids this problem.


[Figure: the tape holds the guard symbol # in the first cell, followed by
a_1 a_2 a_3 ... a_n B B ...; the head and the control unit in state q_0.]

Fig. 4.2 Turing machine in basic model with guard.


Definition 4.12. A Turing machine in basic model with guard is a system

$M = (Q, \Sigma, \Gamma, \delta, q_0, B, \#, F, C)$


where:


• $\#$ - the guard symbol (of the tape alphabet), $\# \in \Gamma$,
$\# \notin \Sigma$,

• other components are as in Definition 4.1,

• the input configuration is shown in Figure 4.2,

• the head can visit the first cell (with guard), but cannot change its
contents.


Proposition 4.3. Turing machines in basic model with guard are equivalent to
Turing machines in basic model.


Proof. Given a Turing machine in basic model

$M = (Q, \Sigma, \Gamma, \delta, q_0, B, F, C)$


its equivalent with guard can be constructed by supplementing the system with
the guard symbol, storing the guard symbol in the leftmost cell, storing input
data starting from the second leftmost cell and placing the head over the
second leftmost cell (over the leftmost input symbol):


$M_g = (Q, \Sigma, \Gamma, \delta, q_0, B, \#, F, C)$


Other components of $M_g$ stay unchanged. Any computation of $M_g$ will be
exactly the same as for $M$ and the head will never visit the guard cell.

Conversely, given a Turing machine in basic model with guard


$M_g = (Q, \Sigma, \Gamma, \delta, q_0, B, \#, F, C)$

the following machine is equivalent:

$M' = (Q', \Sigma, \Gamma, \delta', q_0, B, F, C')$


The machine $M'$ will do the following computation:


• shifts input data one cell right, stores the guard symbol $\#$ in the first
cell and leaves the head at the first input symbol (on the second leftmost
cell),

• simulates the computation of $M_g$,

• shifts the output data one cell left (this deletes the guard symbol in the
leftmost cell) and leaves the head at the leftmost output symbol (at the
leftmost cell).


Remark 4.3. The other models of Turing machines, e.g. Turing machines with the 
stop property or Turing machines with halting states, can be turned to models with 
guard. Formulation and proof of equivalence of models with guard and other models 
of Turing machines are analogous to the proof of Proposition 4.3. 


4.1.5 Turing machines with a multi-track tape 


A Turing machine with a multi-track tape has its tape split lengthwise into a
given number of tracks. Every track is split into cells and is one-way
infinite. The head reads the symbols of all cells of the same slice (column)
and shifts simultaneously over all tracks. An initial configuration of a Turing
machine with a $k$-track tape is shown in Figure 4.3. Input data is stored in
the beginning cells of track no. 1, while all other cells of track no. 1 and
all cells of the other tracks are filled with the blank symbol $B$.


Definition 4.13. A Turing machine in basic model with a multi-track ($k$-track)
tape is a system

$M = (Q, \Sigma, \Gamma, \delta, q_0, B, F, C)$


where:


• $\delta : Q \times \Gamma^k \to Q \times \Gamma^k \times \{L, R\}$ is the
transition function,

• input data $w = a_1 a_2 \ldots a_n$ is represented as a sequence of
$k$-tuples $((a_1, B, \ldots, B), (a_2, B, \ldots, B), \ldots, (a_n, B, \ldots, B))$;
every $k$-tuple fills one slice of the tape, cf. Figure 4.3,

• output data is represented in a way similar to the representation of input
data, i.e. it is stored in the beginning cells of the first track while all
other cells are filled with the blank symbol,

• other components are as in Definition 4.1.


Proposition 4.4. Turing machines with a multi-track tape are equivalent to Turing 
machines in basic model. 


Proof. First of all, a Turing machine in basic model is a special case of a
Turing machine with a multi-track tape having just one track.

Secondly, a Turing machine with a $k$-track tape

$M' = (Q, \Sigma', \Gamma', \delta, q_0, B', F, C)$


is equivalent to the following Turing machine in basic model:

$M = (Q, \Sigma, \Gamma, \delta, q_0, B, F, C)$


where:


• $\Sigma = \Sigma' \times \{B'\} \times \{B'\} \times \ldots \times \{B'\}$ -
a product of $k$ sets,

• $\Gamma = \Gamma' \times \Gamma' \times \ldots \times \Gamma'$ - a product of
$k$ sets,

• $B = (B', B', \ldots, B')$ - a $k$-tuple,

• other components are as in the machine $M'$.


[Figure: a tape with k tracks; track no. 1 holds a_1 a_2 ... a_n B B ...,
tracks 2 to k hold only blank symbols B; the head spans all tracks and the
control unit is in state q_0.]

Fig. 4.3 Turing machine in basic model with a multi-track tape.
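
The idea of treating a whole slice as one symbol can be sketched in Python,
where a $k$-tuple serves as a single composite tape symbol. The sketch is
illustrative only; the function names `encode_input`, `composite_blank` and
`track` are hypothetical, and symbols are assumed to be single characters.

```python
def encode_input(word, k, blank="B"):
    """Represent w = a_1...a_n as k-tuples (a_i, B, ..., B), one per slice."""
    return [(a,) + (blank,) * (k - 1) for a in word]


def composite_blank(k, blank="B"):
    """The blank symbol of the basic-model machine: the k-tuple (B, ..., B)."""
    return (blank,) * k


def track(tape, number):
    """Read back one track (numbered from 1) of a tape of k-tuples."""
    return [cell[number - 1] for cell in tape]


tape = encode_input("ab", k=3)
print(tape)
print(track(tape, 1), track(tape, 3))
```

Because tuples are hashable, such composite symbols can be used directly as
keys of a transition table written for the basic model.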


The above conclusion is not surprising. Turing machines with a multi-track tape 
work in a way quite similar to Turing machines in basic model, i.e. they have one- 
way infinite tape, their one head reads and writes data of the whole slice at a time, 
etc. Therefore, when data of a slice is interpreted as one symbol of a tape alphabet, a 
Turing machine with a multi-track tape is quite similar to a Turing machine in basic 
model. Turing machines with a multi-track tape are essentially useful in proofs of 
equivalence of other, more important, models of Turing machines. This is the main 
motivation for discussing this model. 


4.1.6 Turing machines with two-way infinite tape 


Variations of Turing machines discussed so far are slightly modified Turing
machines in basic model. A two-way infinite tape is the first important
modification of Turing machines in basic model, cf. Figure 4.4. When Remark 4.2
is employed, Turing machines with two-way infinite tape avoid the problem of
moving the head to the beginning of the tape or to the beginning of the data
stored on the tape: the first cell with the blank symbol preceding the
non-blank symbols indicates the beginning of the data stored on the tape.


[Figure: a two-way infinite tape holding ... B B a_0 a_1 a_2 ... a_n B B ...,
the head, and the control unit in state q_0.]

Fig. 4.4 Turing machine with two-way infinite tape.


Definition of Turing machine with two-way infinite tape is identical with De- 
finition 4.1. Turing machine with two-way infinite tape does not need to identify 
the beginning of tape. However, definition of configuration (step description) of this 
type of machines is slightly different than that of machines in basic model. The 
difference is in description of the left sequence of symbols. 


Definition 4.14. A step description (a configuration) of a Turing machine with
two-way infinite tape $M = (Q, \Sigma, \Gamma, \delta, q_0, B, F, C)$ is the
following sequence of symbols:


$\alpha_1 \, q \, \alpha_2$


where:


• $q \in Q$ and $\alpha_2$ are the same as in Definition 4.2,

• $\alpha_1$ is the sequence of symbols stored in cells left of the head,
starting with the leftmost non-blank symbol and ending with the symbol in the
cell preceding the one under the head.


Proposition 4.5. Turing machines with two-way infinite tape are equivalent to Tu- 
ring machines in basic model. 


Proof. We prove that Turing machines with two-way infinite tape and Turing
machines with a multi-track tape are equivalent. Because Turing machines with a
multi-track tape are equivalent to Turing machines in basic model, cf.
Proposition 4.4, we get the equivalence declared in this Proposition.

For a given Turing machine in basic model $M_1$, an equivalent Turing machine
with two-way infinite tape $M_2$ is equal to $M_1$. Computation of $M_2$ is
done on the right half of the tape (the part of the tape which begins with the
cell holding the first symbol of input data).

Assuming that a machine with two-way infinite tape is given as follows:


$M_2 = (Q_2, \Sigma_2, \Gamma_2, \delta_2, q_0^2, B_2, F_2, C)$

we will construct a machine in basic model:


$M_1 = (Q_1, \Sigma_1, \Gamma_1, \delta_1, q_0^1, B_1, F_1, C)$


First of all, a method of representing a two-way infinite tape must be found.
The one-way infinite tape of $M_1$ is a 2-track tape. The right half of the
two-way infinite tape corresponds to the upper track and the left half, rotated
by 180°, corresponds to the lower track, cf. Figure 4.4 and Figure 4.5.

[Figure: a 2-track tape; the upper track holds a_0 a_1 a_2 ... a_n B B ..., the
lower track holds only blank symbols B; the head and the control unit in state
q_0.]

Fig. 4.5 Turing machine with two-track tape - simulation of two-way infinite tape.

Computation of $M_2$ done on the right half of the tape is followed on the
upper track of $M_1$. Computation of $M_2$ done on the left half of the tape is
followed on the lower track of $M_1$, with the head shifting in the direction
opposite to that of the head of $M_2$. Note that the first cell of the lower
track plays the role of a guard, while the content of the left half of the tape
is stored beginning with the second cell, cf. Figure 4.6.

Formal description of Mı simulating M2 is as follows: 


© Qı =Q x {U,B}U {40} 
e H=hxhuhx{f} 
e X=% x {Bo} 
e Fi ={(4,U),(¢,L):4€ Fo} 
e B= (B2,B2) 
Symbols U and L depict placement of the head of M3 on the right and the left 
half of the tape. 


The first transition of $M_1$ stores the guard symbol in the first cell of the lower track
and simulates the first transition of $M_2$, going to either the upper or the lower track:

\[
\delta_1(q_0^1, (a_0, B_2)) =
\begin{cases}
((p,U), (Y, ¢), R) & \text{if } \delta_2(q_0^2, a_0) = (p, Y, R) \\
((p,L), (Y, ¢), R) & \text{if } \delta_2(q_0^2, a_0) = (p, Y, L)
\end{cases}
\]
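When the head of $M_1$ is placed to the right of the first cell (the computation of $M_2$
cannot change the current half of the tape), then:

\[
\left.
\begin{array}{l}
\delta_1((q,U), (X,Z)) = ((p,U), (Y,Z), A) \\
\delta_1((q,L), (Z,X)) = ((p,L), (Z,Y), \bar{A})
\end{array}
\right\} \quad \text{if } \delta_2(q,X) = (p,Y,A)
\]

where $A$ is a direction of the head shift and $\bar{A}$ is the direction opposite to $A$.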
When the head of $M_1$ is placed at the first cell (the computation of $M_2$ may change
the current half of the tape), then:

\[
\left.
\begin{array}{l}
\delta_1((q,U), (X,¢)) = ((p,U), (Y,¢), R) \\
\delta_1((q,L), (X,¢)) = ((p,U), (Y,¢), R)
\end{array}
\right\} \quad \text{if } \delta_2(q,X) = (p,Y,R)
\]




Fig. 4.6 Turing machine with two-track tape - simulation of computation of Turing machine with
two-way infinite tape.


\[
\left.
\begin{array}{l}
\delta_1((q,U), (X,¢)) = ((p,L), (Y,¢), R) \\
\delta_1((q,L), (X,¢)) = ((p,L), (Y,¢), R)
\end{array}
\right\} \quad \text{if } \delta_2(q,X) = (p,Y,L)
\]


where:

• $U$ and $L$, appearing in state labels, stand for the upper and the lower track,
• $L$ and $R$, appearing as the third element of the value (3-tuple) of the transition
  function, stand for the left and right directions of head shifts.
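
The folding of the two-way infinite tape onto two tracks can be illustrated with a short Python sketch. It is not part of the construction itself: the function names `fold_position`, `fold_tape` and `folded_shift` are hypothetical, cells of the two-way tape are assumed to be indexed by integers with cell 0 holding the first input symbol, and the guard is written as ¢.

```python
GUARD = "¢"   # guard symbol stored in the first cell of the lower track
BLANK = "B"

def fold_position(i):
    """Map cell index i of the two-way tape to (track, cell) on the one-way tape.

    Non-negative cells go to the upper track. Cell -1 goes to lower-track
    cell 1, cell -2 to cell 2, and so on, because lower-track cell 0 is
    reserved for the guard.
    """
    return ("U", i) if i >= 0 else ("L", -i)

def fold_tape(cells):
    """Fold a dict {index: symbol} into a list of (upper, lower) pairs."""
    width = max([abs(i) for i in cells] + [0]) + 1
    upper = [BLANK] * width
    lower = [GUARD] + [BLANK] * (width - 1)
    for i, symbol in cells.items():
        track, pos = fold_position(i)
        if track == "U":
            upper[pos] = symbol
        else:
            lower[pos] = symbol
    return list(zip(upper, lower))

def folded_shift(track, direction):
    """Head shift on the folded tape: unchanged on the upper track,
    reversed on the lower track."""
    if track == "U":
        return direction
    return {"L": "R", "R": "L"}[direction]
```

For example, `fold_tape({-2: "x", -1: "y", 0: "a", 1: "b"})` places `a`, `b` on the upper track and the guard followed by `y`, `x` on the lower track.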


In light of Proposition 4.4 and Proposition 4.5 the following conclusion is fairly 
obvious: 


Proposition 4.6. Turing machines with a multi-track two-way infinite tape are
equivalent to Turing machines in the basic model.


4.1.7 Multi-tape Turing machines 


A multi-tape Turing machine, cf. Figure 4.7, satisfies the following assumptions:

• it has several tapes and one head for every tape,
• tapes are two-way infinite,
• the initial configuration assumes that:

  - the control unit is in the initial state,
  - input data is stored on the first tape,
  - the head of the first tape is placed over the first (leftmost) symbol of input
    data,
  - all other cells of the first tape and all cells of the other tapes are filled with
    the blank symbol,

• a transition of a multi-tape Turing machine proceeds as follows:

  - the control unit switches to some state,
  - every head stores a tape symbol in its cell,
  - every head shifts left, shifts right or stays in its current position,
    independently of the other heads.

Fig. 4.7 A multi-tape Turing machine.


The formal definition of a multi-tape Turing machine is as follows:

Definition 4.15. A multi-tape Turing machine (with $k$ tapes) is a system

\[ M = (Q, \Sigma, \Gamma_1 \times \Gamma_2 \times \ldots \times \Gamma_k, \delta, q_0, B, F, C) \]

where:

• $\Gamma_1, \Gamma_2, \ldots, \Gamma_k$ are the alphabets of the tapes. It is assumed that all tapes
  have the same alphabet $\Gamma$, unless stated otherwise,
• $\delta : Q \times (\Gamma_1 \times \Gamma_2 \times \ldots \times \Gamma_k) \to Q \times (\Gamma_1 \times \Gamma_2 \times \ldots \times \Gamma_k) \times \{L,R,S\}^k$
  is the transition function with directions of head shifts: left, right and stop
  (no shift of a head),
• other components are as in Definition 4.1.
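
A minimal Python sketch may help to read the shape of the transition function in Definition 4.15. It rests on several assumptions that are not in the definition: the transition function is a dictionary, tapes are dictionaries indexed by integers, and the stop condition $C$ is replaced by "stop when an accepting state is reached or no transition is defined, or after `max_steps` steps". The names `run_multitape` and `COPY` are hypothetical.

```python
from collections import defaultdict

BLANK = "B"
MOVE = {"L": -1, "R": 1, "S": 0}

def run_multitape(delta, start, accepting, word, k, max_steps=1000):
    """Run a deterministic k-tape machine.

    delta maps (state, k-tuple of read symbols) to
    (state, k-tuple of written symbols, k-tuple of shifts).
    """
    tapes = [defaultdict(lambda: BLANK) for _ in range(k)]
    for i, symbol in enumerate(word):
        tapes[0][i] = symbol            # input data is stored on the first tape
    heads = [0] * k
    state = start
    for _ in range(max_steps):
        if state in accepting:
            break                       # assumed stop condition of this sketch
        read = tuple(tapes[j][heads[j]] for j in range(k))
        if (state, read) not in delta:
            break                       # assumption: treated as non-acceptance
        state, written, shifts = delta[(state, read)]
        for j in range(k):
            tapes[j][heads[j]] = written[j]
            heads[j] += MOVE[shifts[j]]  # every head moves independently
    return state in accepting, tapes

# Hypothetical 2-tape machine copying a word over {a, b} to the second tape.
COPY = {
    ("q0", ("a", BLANK)): ("q0", ("a", "a"), ("R", "R")),
    ("q0", ("b", BLANK)): ("q0", ("b", "b"), ("R", "R")),
    ("q0", (BLANK, BLANK)): ("qf", (BLANK, BLANK), ("S", "S")),
}
```

Calling `run_multitape(COPY, "q0", {"qf"}, "abba", 2)` returns the acceptance flag together with the tapes, so the second tape can be inspected after the run.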


Proposition 4.7. Multi-tape Turing machines are equivalent to Turing machines
in the basic model.


Proof. A Turing machine in the basic model is equivalent to some Turing machine with
a two-way infinite tape, cf. Proposition 4.5. Moreover, a Turing machine with a
two-way infinite tape is a special case of a multi-tape Turing machine: it is just a
multi-tape Turing machine with one tape. Therefore, for any Turing machine in the basic
model we can directly find an equivalent multi-tape Turing machine.


Fig. 4.8 Turing machine with a multi-track two-way infinite tape simulating a multi-tape Turing
machine. The blank symbol is not printed for the sake of clarity, i.e. all cells which are empty in
this Figure hold the blank symbol.


Now we outline the construction of a Turing machine $M_1$ with a multi-track
two-way infinite tape for a given $k$-tape Turing machine $M_m$:

• the $k$ tapes are represented on a two-way infinite tape with $2k$ tracks; every tape
  corresponds to a pair of tracks, cf. Figure 4.8:

  - the contents of every tape are stored on the bottom track of the corresponding pair,
  - the special symbol $H$, stored in a cell of the upper track of the corresponding pair,
    marks the head position of the tape of $M_m$,
  - all other cells of both tracks of the pair are filled with the blank symbol $B$,

• the initial configuration of $M_1$ is as follows:

  - the initial state of $M_m$ is stored in the relevant state of $M_1$,
  - input data stored on the first tape of the multi-tape Turing machine is represented
    on the lower track of the first pair of tracks of $M_1$,
  - the head markers of all tapes of $M_m$ are placed in the tape slice holding the
    first symbol of input data,

• a transition of the machine $M_m$ is simulated by several transitions of the machine
  $M_1$:

  - the head of $M_1$ passes from the tape slice holding the leftmost head marker $H$
    to the tape slice holding the rightmost head marker. This is the so-called
    collect-data pass. Information about the symbols under all head markers (under the
    heads of $M_m$) is collected during this pass and remembered in a relevant state of $M_1$,
  - the transition function of $M_m$ is applied to the collected data (the state of $M_m$ and
    the symbols under the heads of $M_m$). The result of the transition function is remembered




    in a state of $M_1$. Recall that the transition function of $M_m$ returns a state, the
    symbols to be stored by the heads and the directions of the head shifts,
  - the head of $M_1$ returns to the tape slice holding the leftmost head marker $H$.
    This is the so-called update pass. The symbols under the head markers and the positions
    of the head markers are updated during this pass. The updates are done according to the
    value of the transition function obtained in the previous point,

• the computation of $M_1$ terminates if and only if the stop condition of $M_m$ is
  satisfied. The input data of $M_1$ is accepted if and only if $M_m$ terminates its
  computation in an accepting state.


Note that the simulation of a multi-tape Turing machine by a Turing machine with
one tape requires a huge set of states, which greatly enlarges the transition function.
It also increases the number of transitions for the same input data.

As described in the proof of Proposition 4.7, the data collected during the left-to-right
pass of the head of the simulating machine are stored in the set of states. So, the basic
set of states $Q_1$ of the simulating machine should be extended by a Cartesian product with:

• a pass indicator: $Q_1 \times \{C,U\}$, to indicate whether this is the collect-data pass or
  the update pass,
• the set of states of the multi-tape machine: $Q_1 \times \{C,U\} \times Q_m$, to keep the state
  of $M_m$ during both passes,
• the alphabets of tapes: $Q_1 \times \{C,U\} \times Q_m \times \Gamma_1 \times \Gamma_2 \times \ldots \times \Gamma_k$, for the symbols
  under the heads during both passes,
• markers of visited heads; this might be done by extending the tapes' alphabets with a
  special marker-not-visited-yet symbol $h$: $Q_1 \times \{C,U\} \times Q_m \times \Gamma_1' \times \Gamma_2' \times \ldots \times \Gamma_k'$,
  where $\Gamma_i' = \Gamma_i \cup \{h\}$ for any tape $i$,
• a value of the transition function $\delta_m$, which is stored in the already included
  components $Q_m \times \Gamma_1' \times \Gamma_2' \times \ldots \times \Gamma_k'$ together with the update pass indication,
• the heads' shifts: $Q_1 \times \{C,U\} \times Q_m \times \Gamma_1' \times \Gamma_2' \times \ldots \times \Gamma_k' \times \{L,R,S\}^k$,
• counters for updating the heads' positions, which would require four symbols for every
  tape: $Q_1 \times \{C,U\} \times Q_m \times \Gamma_1' \times \ldots \times \Gamma_k' \times \{L,R,S\}^k \times \{X^1, X^2, X^3, X^4\}^k$,
• and more states are required to organize the details of the simulation.


Thus, states could be labelled by elements of the above Cartesian product, i.e. by
$(3k+3)$-tuples. Hence the number of states is not less than

\[ r = 2 \cdot 3^k \cdot 4^k \cdot |Q_1| \cdot |Q_m| \cdot |\Gamma_1| \cdot |\Gamma_2| \cdot \ldots \cdot |\Gamma_k| \]

On the other hand, symbols of the tape alphabet of the simulating machine could
be denoted by the tuples of the Cartesian product of the tapes' alphabets and the symbols
stored in the tracks with the heads' markers (the blank symbol $B$ and the head marker
symbol $H$): $\{H,B\} \times \Gamma_1 \times \{H,B\} \times \Gamma_2 \times \ldots \times \{H,B\} \times \Gamma_k$. Thus, the number of tape
symbols of the simulating machine is equal to $c = 2^k \cdot |\Gamma_1| \cdot |\Gamma_2| \cdot \ldots \cdot |\Gamma_k|$.

Finally, the size of the transition table of the simulating machine is not less than $r \cdot c$,
compared to the size of the transition table of the multi-tape Turing machine, which is
equal to $|Q_m| \cdot (|\Gamma_1| \cdot |\Gamma_2| \cdot \ldots \cdot |\Gamma_k|)$.

Let us analyze the possible increase in the number of transitions. The worst case is when
the head of one tape of a multi-tape Turing machine always shifts right and the head




of another tape always shifts left. The tape slices with these two heads' markers will be
separated by:

• one tape slice after simulation of the first transition,
• three tape slices after simulation of two transitions,
• $2n - 1$ tape slices after simulation of $n$ transitions.


Simulation of transitions of a multi-tape Turing machine, as described above,
requires the following numbers of transitions of the simulating machine:

• three transitions for simulation of the first transition,
• at least seven transitions for simulation of the second transition,
• at least $4n - 1$ transitions for simulation of the $n$-th transition. This number
  will be increased by the transitions updating the markers of those heads of the
  multi-tape machine which shift right. The increment will not exceed $2(k - 2)$, where
  $k$ is the number of tapes.


As a result, simulation of $n$ transitions of a multi-tape Turing machine requires
not less than

\[ f(n) = 3 + 7 + 11 + \ldots + (4n - 1) + n(k-2) = 4 + 8 + \ldots + 4n + (k-3)n = 2n^2 + (k-1)n \]

transitions of the simulating machine. This number can be estimated as
$2n^2 < f(n) < 3n^2$ for $n$ big enough. Therefore, we can say that the simulation
increases the cost of computation quadratically.
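
The closed form of this sum can be checked numerically. The following Python sketch only re-evaluates the formula given in the text; the function names `simulated_cost` and `closed_form` are hypothetical.

```python
def simulated_cost(n, k):
    """Sum of the per-transition bounds 3 + 7 + ... + (4n - 1), plus n(k - 2)."""
    return sum(4 * i - 1 for i in range(1, n + 1)) + n * (k - 2)

def closed_form(n, k):
    """The closed form 2n^2 + (k - 1)n."""
    return 2 * n * n + (k - 1) * n

def within_quadratic_bounds(n, k):
    """Check the estimate 2n^2 < f(n) < 3n^2 for the given n and k."""
    return 2 * n * n < closed_form(n, k) < 3 * n * n
```

The upper estimate holds as soon as $n > k - 1$, since $(k-1)n < n^2$ exactly in that case; the lower one holds for every $k > 1$.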

The definition of a configuration (step description) of a multi-tape machine must
take into account the contents of several tapes and one state of the control unit:


Definition 4.16. A configuration (step description) of a multi-tape Turing machine
with $k$ tapes is the following sequence of symbols

\[ (\alpha_1^1, \alpha_1^2, \ldots, \alpha_1^k) \; q \; (\alpha_2^1, \alpha_2^2, \ldots, \alpha_2^k) \]

where:

• $q$ is a state of the control unit,
• $\alpha_1^i$ is the sequence of symbols of the $i$-th tape in the cells preceding the head
  (as in Turing machines with a two-way infinite tape),
• $\alpha_2^i$ is the sequence of symbols of the $i$-th tape in the cells from the head to the
  right (as in Turing machines with a two-way infinite tape).


In practice, a step description will be shown in one of the two ways presented in the
examples below:

• the state of the control unit plays the role of head markers, the contents of each tape
  are split into the left and right sequences of symbols, and the contents of the tapes may
  be shifted relative to each other:




Dep 6) CD. Op 69.9.6 
2y2y2y2y2 2y2 
X{X3X5X4X5 4 XGX7 
XPXGXSXG 


• the state of the control unit is placed at the beginning, the contents of the tapes are
  aligned with each other, and the heads' positions are underscored:
lylylylylyly! 
ee 
q Xi X3X3 XI XS X6 X1 
PER ER ES Cy 


4.2 Nondeterministic Turing machines 


The nondeterministic model is a very important modification of Turing machines. The
equivalence of deterministic and nondeterministic Turing machines is the most important
conclusion drawn from the discussion in this Chapter.

As mentioned before, for any configuration of a deterministic Turing machine, at
most one transition can be made. In contrast, nondeterministic Turing machines allow
for a choice between several transitions for some (or all) configurations. This means
that the transition function may yield a set of possible transitions. In this section
we provide a definition of a nondeterministic Turing machine in the basic model. Then
we prove the equivalence between nondeterministic Turing machines in the basic model
and multi-tape deterministic Turing machines. A study of the equivalence of different
models of nondeterministic Turing machines is similar to the analogous study for
deterministic Turing machines. For that reason we skip such a discussion.


Definition 4.17. A nondeterministic Turing machine in the basic model is a system

\[ M = (Q, \Sigma, \Gamma, \delta, q_0, B, F, C) \]

with components as follows:

• $\delta$ is the transition function: $\delta : Q \times \Gamma \to \bigcup_{k \geq 0} (Q \times \Gamma \times \{L,R\})^k$,
  where $(Q \times \Gamma \times \{L,R\})^0$ stands for an undefined value of the transition function and
  $(Q \times \Gamma \times \{L,R\})^k$ denotes the set of values of the transition function consisting of $k$
  3-tuples, each being a transition description $(p,X,D) \in Q \times \Gamma \times \{L,R\}$,
• descriptions of the other components are given in Definition 4.1.


Note that the values of the transition function are sets of several descriptions of
transitions rather than one description (as in the case of deterministic Turing machines),
i.e. we may have a set of $k$ elements as a value of the transition function:

\[ \delta(q,X) = \{(p_1,Y_1,D_1), (p_2,Y_2,D_2), \ldots, (p_k,Y_k,D_k)\} \]

We will informally say that the transition function yields $k$ values.




When the empty set or a one-element set $\{(p,Y,D)\}$ is the value of the transition
function, the transition is performed as for a deterministic machine, i.e.:

1. if the transition function yields the empty set, i.e. it is undefined, then the machine
   falls into an infinite computation,
2. for a value $\{(p,Y,D)\}$ of the transition function, the control unit switches to state $p$,
   the head stores symbol $Y$ and shifts in direction $D$.

However, when the transition function yields a set of $k > 1$ values, a transition can
be explained as follows:

3. a choice is made between the $k$ possible values,
4. for the chosen description, a transition is made as for a deterministic Turing machine,
5. it is assumed that the transition chosen in point 3 leads to a computation
   (in the sense of deterministic Turing machines) which terminates in an accepting
   state, if such a computation exists.


A question can be raised of how to guarantee the assumption of point 5 when a choice is
made. Nondeterminism does not answer this question; it just assumes such choices.


Remark 4.4. The assumption made in point 5 of the above description is the
fundamental assumption of nondeterminism. It creates an interpretation of
nondeterminism in which a computation is a sequence of configurations achieved with
correct choices between possible transitions.


Remark 4.5. Another interpretation of nondeterminism is quite practical. When the
transition function yields a set of $k > 1$ values, the machine creates $k$ copies of
itself. Then every copy makes the transition corresponding to one value and continues its
own computation. The machine accepts if and only if some copy created during the
computation terminates its own computation in an accepting state.


In the spirit of the last interpretation, a nondeterministic Turing machine can be
interpreted as a group of Turing machines. This group includes the original machine
at the start of the computation and might be enlarged during the computation. Every
machine of this group creates copies of itself, as many as the number of values
yielded by the transition function. Then every copy makes a transition, as described
above. Note that every copy performs its computation like a deterministic Turing
machine.

Let us notice that the step description of a nondeterministic Turing machine is exactly
the same as for a deterministic one (of the same model). However, the notions of
transition relation and computation must be redefined for nondeterministic Turing
machines.


Definition 4.18. A pair of configurations of a nondeterministic Turing machine is 
in the transition relation if and only if the second configuration can be derived from 
the first one by application of a transition yielded by the transition function, c.f. 
Definition 4.3. 




Definition 4.19. A computation of a Turing machine $M = (Q, \Sigma, \Gamma, \delta, q_0, B, F, C)$ is
a tree such that:

• its nodes are labelled by configurations of the machine,
• its root is labelled by the initial configuration,
• for any node $n_p$, every child $n_c$ of $n_p$ is related to it by the transition relation,
  i.e. $n_p \succ n_c$.


Note that a computation tree is a $k$-tree, where $k$ is the maximal number of values
yielded by the transition function for given arguments. We say that the degree of
nondeterminism is $k$.


The computation of a nondeterministic Turing machine is a tree, which may have
finite paths from the root to leaves as well as infinite paths beginning in the root.
Interpreting a nondeterministic Turing machine as a group of its copies, we can
associate finite paths with the corresponding copies of the machine. Based on this
interpretation we can say that a nondeterministic Turing machine accepts its input if and
only if there is a copy of the machine which terminates its computation in an accepting
state.


Definition 4.20. A nondeterministic Turing machine accepts its input if and only 
if the computation tree has a path from the root to a leaf corresponding to such a 
configuration, for which the control unit is in an accepting state. 
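
The acceptance condition of Definition 4.20 can be sketched in Python as a level-by-level search of the computation tree. This is an illustration under stated assumptions, not the formal model: a configuration is a triple (state, tape contents, head position) on a one-way tape, a leaf is taken to be a configuration with no applicable transition, the search is cut off at `max_depth`, and the names `successors`, `accepts` and `AB_DELTA` are hypothetical.

```python
BLANK = "B"

def successors(delta, config):
    """All configurations related to config by the transition relation."""
    state, tape, head = config
    symbol = tape[head] if head < len(tape) else BLANK
    result = []
    for new_state, written, direction in delta.get((state, symbol), ()):
        cells = list(tape) + [BLANK] * (head + 1 - len(tape))
        cells[head] = written
        new_head = head + (1 if direction == "R" else -1)
        if new_head >= 0:               # one-way tape: no cell left of cell 0
            result.append((new_state, tuple(cells), new_head))
    return result

def accepts(delta, start, accepting, word, max_depth=50):
    """Look for a root-to-leaf path ending in an accepting state."""
    level = [(start, tuple(word), 0)]   # the root: the initial configuration
    for _ in range(max_depth + 1):
        next_level = []
        for config in level:
            children = successors(delta, config)
            if not children and config[0] in accepting:
                return True             # a leaf in an accepting state
            next_level.extend(children)
        level = next_level
    return False

# Hypothetical machine of degree 2 that guesses where the factor "ab" starts.
AB_DELTA = {
    ("q0", "a"): [("q0", "a", "R"), ("q1", "a", "R")],
    ("q0", "b"): [("q0", "b", "R")],
    ("q1", "b"): [("qf", "b", "R")],
}
```

With this table, `accepts(AB_DELTA, "q0", {"qf"}, "bab")` finds the accepting leaf on the branch that switches to `q1` when reading `a`.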


Proposition 4.8. Nondeterministic Turing machines are equivalent to deterministic 
Turing machines. 


Proof. We will show that for a given Turing machine of one type, an equivalent
machine of the other type can be designed.

Note that a deterministic Turing machine in the basic model is also a nondeterministic
one (with the maximal number of values yielded by the transition function not
greater than 1). For that reason, and due to the equivalence of different types of Turing
machines, any deterministic Turing machine is equivalent to some nondeterministic
one.

Now, for a given nondeterministic Turing machine in the basic model, an equivalent
deterministic multi-tape Turing machine will be built.

Let us present the idea of the construction of a deterministic Turing machine equivalent
to a nondeterministic one. The idea is based on a deterministic simulation of
a computation of a nondeterministic Turing machine. It can be briefly presented as
follows:

• a nondeterministic Turing machine accepts its input if and only if the computation
  tree has a node which terminates the computation in an accepting state,
• a computation tree is a $k$-tree, where $k$ is the maximal number of values yielded
  by the transition function,
• a breadth-first tree searching algorithm, which starts searching from the root and
  then visits nodes level by level, will eventually visit a node which terminates the
  computation in an accepting state, if such a node exists,




• a simulation of a computation of a nondeterministic machine is rooted in
  breadth-first search. This simulation investigates the nodes visited by breadth-first
  search. For a visited node, the computation from the root to this node is
  reproduced. If this computation terminates in an accepting state, then the simulation
  terminates with acceptance. Otherwise, the breadth-first search continues
  and the next node is investigated.


Note that the above method replicates transitions many times: the closer a node is to
the root, the more often the transitions of the path from the root to this node are
replicated. This method has a huge computational complexity. However, a more efficient
deterministic method of simulating nondeterminism is not known.

The above idea can be realized by a 3-tape deterministic Turing machine. Let
us assume that the degree of nondeterminism of the simulated nondeterministic Turing
machine is $r$. The simulation algorithm can be briefly described as follows:


• the input of the nondeterministic machine is stored on the first tape,
• the simulating deterministic machine systematically generates all possible sequences
  of lengths being successive natural numbers $0, 1, 2, \ldots$. The elements of the
  generated sequences are natural numbers from the set $\{1, 2, \ldots, r\}$. Every generated
  sequence is stored on the second tape,
• for every generated sequence the following computation is done:

  - the input is copied from the first tape to the third tape,
  - a path of the computation of the nondeterministic Turing machine is simulated.
    The path begins in the root of the computation tree and is defined by the
    sequence $(i_1, i_2, \ldots, i_n)$:

    - the length of the path is equal to the length of the sequence, i.e. $n$ transitions
      are simulated,
    - the successive $l$-th transition is defined by the $i_l$-th value of the transition
      function of the nondeterministic Turing machine. If the transition function
      yields fewer than $i_l$ values for the given arguments, so that the $i_l$-th value has
      no corresponding transition, then the simulation is broken off and the algorithm
      goes on to generating and investigating the next sequence of numbers,

  - if, for the given generated sequence, a node terminating the computation of the
    nondeterministic Turing machine in an accepting state is reached, then the simulating
    machine stops its computation and accepts the input. Otherwise, the algorithm goes on
    to generating and investigating the next sequence of numbers.


The above simulation algorithm describes a deterministic Turing machine which
is equivalent to the simulated nondeterministic Turing machine.
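
The enumeration of choice sequences can be sketched in Python as follows. This is only an illustration under assumptions: ordinary variables play the role of the three tapes, the values of the transition function are kept in ordered lists so that "the $i$-th value" is well defined, accepting states are assumed to terminate the computation, and the search is bounded by `max_length` so that the sketch always returns. The names `follow_choices`, `simulate` and `AB_DELTA` are hypothetical.

```python
from itertools import product

BLANK = "B"

def follow_choices(delta, start, word, choices):
    """Replay one path of the computation tree; choices are 1-based indices.

    Returns the state reached, or None when a chosen value does not exist.
    """
    state, tape, head = start, dict(enumerate(word)), 0   # fresh copy of the input
    for choice in choices:
        options = delta.get((state, tape.get(head, BLANK)), [])
        if choice > len(options):
            return None                 # no such transition: abandon the sequence
        state, written, direction = options[choice - 1]
        tape[head] = written
        head += 1 if direction == "R" else -1
        if head < 0:
            return None                 # assumption: leaving the tape abandons the path
    return state

def simulate(delta, start, accepting, word, r, max_length=12):
    """Try all sequences over {1, ..., r} of length 0, 1, 2, ... in turn."""
    for length in range(max_length + 1):
        for choices in product(range(1, r + 1), repeat=length):
            if follow_choices(delta, start, word, choices) in accepting:
                return choices          # an accepting path has been found
    return None

# Hypothetical machine of degree 2 that guesses where the factor "ab" starts.
AB_DELTA = {
    ("q0", "a"): [("q0", "a", "R"), ("q1", "a", "R")],
    ("q0", "b"): [("q0", "b", "R")],
    ("q1", "b"): [("qf", "b", "R")],
}
```

Every path is replayed from the root, which mirrors the repeated reproduction of transitions noted in the text.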


4.3 Linear bounded automata 


Many Turing machines use, during computation, only that part of the tape which
was used to store the input data. Such Turing machines are




distinguished as a subclass called linear bounded automata. In this book, only linear
bounded automata which have the stop property, or are equivalent to such automata, will
be considered. Such automata will be used to accept context-sensitive languages. The
formal definition of linear bounded automata is as follows:


Definition 4.21. A linear bounded automaton in the basic model is a Turing machine
(nondeterministic, in general) with the stop property:

\[ M = (Q, \Sigma, \Gamma, \delta, q_0, B, \#, \&, F, C) \]

with components as follows:

• $\#, \&$ are the left and the right guard, $\#, \& \in \Gamma$, $\#, \& \notin \Sigma$,
• $\# q_0 a_1 a_2 \ldots a_n \&$ is the initial configuration, where $a_1 a_2 \ldots a_n$ is the input data,
• the transition function cannot yield a value that allows the head:

  - to replace the left guard symbol with any other symbol or to make a shift left
    when it reads the left guard symbol, i.e.
    $\delta(q,\#) = \{(p_1,\#,R), (p_2,\#,R), \ldots, (p_k,\#,R)\}$ for any $q, p_1, p_2, \ldots, p_k \in Q$,
  - to replace the right guard symbol with any other symbol or to make a shift
    right when it reads the right guard symbol, i.e.
    $\delta(q,\&) = \{(p_1,\&,L), (p_2,\&,L), \ldots, (p_k,\&,L)\}$ for any $q, p_1, p_2, \ldots, p_k \in Q$,
  - to store a guard symbol in any cell other than those holding the guards, i.e.
    if $X \in \Gamma - \{\#,\&\}$ and $q \in Q$,
    then $\delta(q,X) = \{(p_1,Y_1,D_1), (p_2,Y_2,D_2), \ldots, (p_k,Y_k,D_k)\}$,
    where $Y_1, Y_2, \ldots, Y_k \in \Gamma - \{\#,\&\}$, $p_1, p_2, \ldots, p_k \in Q$ and
    $D_1, D_2, \ldots, D_k \in \{L,R\}$,

• other components are as in Definition 4.1.
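
The three restrictions on the transition function can be checked mechanically. The Python sketch below assumes that the transition function is given as a dictionary from (state, symbol) pairs to collections of (state, symbol, direction) triples; the function name `respects_guards` is hypothetical.

```python
LEFT, RIGHT = "#", "&"

def respects_guards(delta):
    """Check the guard restrictions of a linear bounded automaton."""
    for (_, read), values in delta.items():
        for _, written, direction in values:
            # the left guard must be rewritten as itself and the head must shift right
            if read == LEFT and (written, direction) != (LEFT, "R"):
                return False
            # the right guard must be rewritten as itself and the head must shift left
            if read == RIGHT and (written, direction) != (RIGHT, "L"):
                return False
            # no guard may be stored in a cell that does not already hold one
            if read not in (LEFT, RIGHT) and written in (LEFT, RIGHT):
                return False
    return True
```

A table that passes this check keeps the head between the guards for the whole computation.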


The discussion of the varieties of Turing machines can be adapted to the analysis of
linear bounded automata. However, not all types of Turing machines have counterparts
among linear bounded automata. For instance, Turing machines in the basic model
or with a two-way infinite tape do not have equivalent linear bounded automata. On
the other hand, the idea of furnishing Turing machines with guards is shared with the
definition of linear bounded automata, though the latter have more restrictions
than the former. Such ideas utilized for Turing machines as halting states and a
multi-track tape may be directly adapted to linear bounded automata. Also, the idea
of multiple tapes is adaptable to linear bounded automata in the sense that all tapes have
length equal to that of the input data and have the left and the right guard.

There is a model which gave rise to the name of linear bounded automata. In this
model the length of the tape is bounded by a linear function of the length of the input
data or, equivalently, is a multiple of the length of the input data. In these automata,
the beginning and the end of the tape are marked by guards, as in the basic model.


Proposition 4.9. The following classes of deterministic linear bounded automata
are equivalent:

• in the basic model,
• with a halting accepting state,
• with the length of the tape bounded by a linear function,
• with a multi-track tape,
• with multiple tapes.

The following classes of nondeterministic linear bounded automata are equivalent:

• in the basic model,
• with a halting state,
• with the length of the tape bounded by a linear function,
• with a multi-track tape,
• with multiple tapes.


Proof. A proof of the equivalence of these classes is quite similar to the proofs of
equivalence of the respective classes of Turing machines.


Chapter 5 
Pushdown automata 


Pushdown automata differ from Turing machines in their definition and interpretation.
However, despite the differences, we will prove in this Chapter that pushdown
automata are limited Turing machines. Pushdown automata are finite structures with
a stack, which is a potentially infinite element. A stack is a data structure, also
called LIFO, i.e. "last in, first out", which allows for storing abstract elements of data
and removing them. Usually two operations are used for operating a stack: push and
pop. The push operation adds an element to the stack, hiding the elements pushed
previously, or initializes the stack if it is empty. The pop operation removes and returns
the element most recently added to the stack, or returns the empty value if the stack is
empty (this is why a stack is also called a LIFO structure). Note that the elements
of the stack are not accessible, except the one on the top of the stack. In order
to get access to a requested element formerly pushed on the stack, it is necessary to
pop all elements pushed later than the requested one. In other words, elements are
removed from the stack in the reverse order to the order of their addition. Thus, the
accessibility of stack data is significantly limited compared to the tapes of Turing
machines. As a result, pushdown automata are less powerful than Turing machines.
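
The push and pop operations described above can be sketched with a Python list. The class name `Stack` is hypothetical, and `None` is assumed to play the role of the empty value returned by pop on an empty stack.

```python
class Stack:
    """A LIFO structure: only the most recently pushed element is accessible."""

    def __init__(self):
        self._items = []

    def push(self, item):
        # the new element hides all elements pushed previously
        self._items.append(item)

    def pop(self):
        # remove and return the top element, or the empty value for an empty stack
        return self._items.pop() if self._items else None
```

After pushing 1, 2 and 3, successive pops return 3, 2 and 1, i.e. the reverse of the order of addition.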


5.1 Nondeterministic pushdown automata 


We discuss nondeterministic pushdown automata first. The conditions which deterministic
ones must fulfil are clearer when based on the general definition of nondeterministic ones.


Definition 5.1. A pushdown automaton is a system

\[ M = (Q, \Sigma, \Gamma, \delta, q_0, \triangleright, F, R) \]

with components as follows:

$Q$ - a finite set of states,
$\Gamma$ - a finite set of stack symbols (stack alphabet),
$\triangleright$ - an initial stack symbol, $\triangleright \in \Gamma$,
$\Sigma$ - a finite input alphabet,
$q_0$ - the initial state, $q_0 \in Q$,
$F$ - a set of accepting states, $F \subset Q$,
$R$ - a set of rejecting states, $R \subset Q$, $F \cap R = \emptyset$,
$\delta$ - a transition function, $\delta : Q \times (\Sigma \cup \{\varepsilon\}) \times \Gamma \to \bigcup_{k=0}^{\infty} (Q \times \Gamma^*)^k$.


Fig. 5.1 Pushdown automaton.


A pushdown automaton can be interpreted as a physical mechanism shown in
Figure 5.1. This mechanism consists of:

• a control unit, which is in a state $q \in Q$,
• an input tape holding input data $a_1 a_2 \ldots a_n$; the special end-of-input symbol $\triangleleft$
  is attached to the input data. The symbol $\triangleleft$ belongs neither to the input alphabet
  nor to the stack alphabet; it marks the end of input, which makes pushdown automata
  easier to work with,
• an input head, which reads an input symbol and shifts right, or does not take any
  action,
• a stack, which is a one-way infinite tape with a stack structure, giving access
  only to its top cell, i.e. its first cell,
• a stack head, which can perform push and pop operations.


A pushdown automaton is intended to accept a language, i.e. to answer whether its
input word is in the language or not. The computation of a given pushdown automaton
is done according to the following intuitive procedure:

1. the initial configuration of a given pushdown automaton is described as follows:

   a. the input data, a word $w = a_1 a_2 \ldots a_n$ over the input alphabet $\Sigma$, is stored on
      the input tape, cf. Figure 5.1,
   b. the end-of-input symbol $\triangleleft$ is attached to the input data,
   c. the head of the input tape is placed over the first (leftmost) symbol of the input
      word,
   d. the head of the stack is (always) placed over the top cell of the stack,
   e. the control unit is in the initial state $q_0$,

2. if the input head reads the end-of-input symbol $\triangleleft$ and the control unit is in an
   accepting or a rejecting state, then the computation is terminated and the automaton
   responds with its state, i.e. whether the state of its control unit is an accepting one
   or a rejecting one,

3. if the conditions of point 2 are not satisfied, then based on

   a. a state $q$ of the control unit,
   b. a symbol $X$ read by the head of the stack,
   c. either an input symbol $a$ or no input symbol,

   the automaton takes the following actions:

   d. the values $\{(p_1,\alpha_1), (p_2,\alpha_2), \ldots, (p_k,\alpha_k)\}$ of the transition function
      $\delta(q,a,X)$ or $\delta(q,\varepsilon,X)$ are computed, where $a \in \Sigma$ is the input symbol,
      $X \in \Gamma$ is the top symbol of the stack, $q, p_1, p_2, \ldots, p_k \in Q$ are states, and
      $\alpha_1, \alpha_2, \ldots, \alpha_k \in \Gamma^*$ are strings of symbols of the stack alphabet,
   e. if the values of the transition function are computed based on an input symbol,
      i.e. $\delta(q,a,X)$, then the input head is shifted right; otherwise the input head
      does not change its position,
   f. a value $(p_i,\alpha_i)$ of the transition function is chosen nondeterministically,
   g. the top symbol $X$ of the stack is removed and then the symbols of the string $\alpha_i$
      are pushed on the stack in reverse order, i.e. the last symbol first, the first one
      last. The first symbol of $\alpha_i$ will be on the top of the stack after this operation,
   h. the control unit switches to the state $p_i$,

4. the computation goes to point 2.


We assume that pushdown automata always terminate computation. The value
$(Q \times \Gamma^*)^0$ of the transition function, which is the empty word $\varepsilon$, is interpreted as
termination of the computation in a rejecting state.
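
The procedure above can be sketched in Python by exploring all nondeterministic choices at once. The sketch makes several assumptions that are not in the text: the characters `<` and `>` stand for the end-of-input symbol and the initial stack symbol, stack symbols are single characters, the empty string stands for $\varepsilon$ in the keys of the transition table, and the search is cut off after `max_steps` rounds. The names `pda_accepts` and `ANBN` are hypothetical.

```python
END = "<"      # stands for the end-of-input symbol
BOTTOM = ">"   # stands for the initial stack symbol

def pda_accepts(delta, start, accepting, rejecting, word, max_steps=200):
    """Follow all choices in parallel; the stack top is the last tuple element."""
    tape = list(word) + [END]
    frontier = [(start, (BOTTOM,), 0)]
    for _ in range(max_steps):
        next_frontier = []
        for state, stack, pos in frontier:
            at_end = tape[pos] == END
            if at_end and state in accepting:
                return True                     # point 2: accepting response
            if (at_end and state in rejecting) or not stack:
                continue                        # this branch has terminated
            top, rest = stack[-1], stack[:-1]
            moves = [((state, "", top), 0)]     # epsilon move: input head stays
            if not at_end:
                moves.append(((state, tape[pos], top), 1))  # reading move: shift right
            for key, shift in moves:
                for new_state, alpha in delta.get(key, ()):
                    # push alpha in reverse order, so its first symbol ends on top
                    new_stack = rest + tuple(reversed(alpha))
                    next_frontier.append((new_state, new_stack, pos + shift))
        frontier = next_frontier
        if not frontier:
            return False                        # every branch has terminated
    return False

# Hypothetical automaton for words a^n b^n with n >= 1.
ANBN = {
    ("q0", "a", ">"): [("q0", "A>")],
    ("q0", "a", "A"): [("q0", "AA")],
    ("q0", "b", "A"): [("q1", "")],
    ("q1", "b", "A"): [("q1", "")],
    ("q1", "", ">"): [("qf", ">")],
}
```

For the table `ANBN`, the call `pda_accepts(ANBN, "q0", {"qf"}, set(), "aabb")` reaches the accepting state with the whole input read.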

Note that pushdown automata cannot store output information, so, unlike Turing
machines, they can only accept languages and cannot compute functions or solve
problems.

Now we define a step description, a transition relation and a computation of
pushdown automata.


Definition 5.2. A step description (a configuration) of a pushdown automaton
$M = (Q, \Sigma, \Gamma, \delta, q_0, \triangleright, F, R)$ is the following sequence of symbols:

\[ \gamma \; q \; \omega \]

where:

• $q \in Q$ is the current state of the control unit of the pushdown automaton,
• $\gamma$ is the stack contents; the last symbol of $\gamma$ is the top symbol of the stack,
• $\omega$ is the current input; the first symbol of $\omega$ is the symbol under the input head.


Note that the sequences of symbols $\gamma$ and $\omega$ are words over the stack alphabet
$\Gamma$ and the input alphabet $\Sigma$, respectively, and that either of these sequences may
consist of only the



initial stack symbol $\triangleright$ and the end-of-input symbol $\triangleleft$, respectively. However,
neither of these two sequences can be infinite, as in the case of a step description of a
Turing machine.

For instance, the initial step description (configuration) is of the form $\triangleright \, q_0 \, w \, \triangleleft$,
where $q_0$ is the initial state and $w$ is the input data. In this case the stack is empty, so
the initial stack symbol is next to the initial state symbol. On the other hand, a step
description $\gamma \, q \, \triangleleft$ informs that all input symbols have already been read.

Let us analyze the transitions made by pushdown automata. We assume that a step
description is given by the following sequence of symbols:

\[ \triangleright X_1 X_2 \ldots X_{m-1} X_m \; q \; a_i a_{i+1} \ldots a_n \triangleleft \]

Recall that:

• $q$ is the current state of the control unit,
• $\triangleright X_1 X_2 \ldots X_{m-1} X_m$ is the sequence of symbols on the stack, $X_m$ is the top symbol
  of the stack,
• $a_i a_{i+1} \ldots a_n \triangleleft$ is the sequence of input symbols, $a_i$ is the front input symbol,
• the head of the stack reads $X_m$,
• the input head either reads $a_i$ or does not read anything.


The next step description is determined by the transition function:

• if the value of the transition function is, for instance, $\delta(q, a_i, X_m) = \{(p_1, \varepsilon), (p_2, Y_1 Y_2 Y_3)\}$ and the second value $(p_2, Y_1 Y_2 Y_3)$ is chosen nondeterministically, then the following step description is yielded:

$\rhd \; X_1 X_2 \dots X_{m-1} Y_1 Y_2 Y_3 \; p_2 \; a_{i+1} \dots a_n \; \lhd$

• if the value of the transition function is, for instance, $\delta(q, \varepsilon, X_m) = \{(p_1, \varepsilon), (p_2, Y_1 Y_2 Y_3)\}$ and the first value $(p_1, \varepsilon)$ is chosen nondeterministically, then the following step description is yielded:

$\rhd \; X_1 X_2 \dots X_{m-1} \; p_1 \; a_i a_{i+1} \dots a_n \; \lhd$


We will use the symbol $\succ$ to denote a transition of a pushdown automaton. The transition symbol $\succ$ may be supplemented with a pushdown automaton name, $\succ_M$, to emphasize that a transition concerns a given pushdown automaton. It can also be supplemented with a superscript, $\succ^k$, to indicate that k transitions have been made.

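The above two transitions made by a pushdown automaton will be denoted as follows:

$\rhd X_1 \dots X_{m-1} X_m \; q \; a_i a_{i+1} \dots a_n \lhd \;\succ\; \rhd X_1 \dots X_{m-1} Y_1 Y_2 Y_3 \; p_2 \; a_{i+1} \dots a_n \lhd$
$\rhd X_1 \dots X_{m-1} X_m \; q \; a_i a_{i+1} \dots a_n \lhd \;\succ\; \rhd X_1 \dots X_{m-1} \; p_1 \; a_i a_{i+1} \dots a_n \lhd$
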
Definition 5.3. Transitions of a pushdown automaton create a binary relation in the space of all possible configurations of the automaton, i.e. any two configurations are related if and only if the second is derived from the first one by application of the transition function. This relation is called the transition relation of a given pushdown automaton. The transitive closure of the transition relation is denoted by $\succ^*$.


Definition 5.4. A computation of a pushdown automaton is a tree such that:

• its nodes are labelled by configurations of the automaton,
• its root is labelled by the initial configuration,
• for any node $\eta_p$, every child $\eta_c$ of it is related to it in the transition relation, i.e. $\eta_p \succ \eta_c$.


Note that a computation tree is a (k+1)-tree, where k is the maximal number of values yielded by the transition function for given arguments. Note that a transition for a given state and a given stack symbol can be chosen from k transitions respective to an input symbol and one transition made without checking the input.


Now we give a formal definition of acceptance of an input by a pushdown automaton.

Definition 5.5. A pushdown automaton accepts its input if and only if the pair of the initial configuration and a final configuration belongs to the transitive closure of the transition relation, i.e. $\eta_I \succ^* \eta_F$, where $\eta_I$ is the initial configuration and $\eta_F$ is a final configuration.


Remark 5.1. A pushdown automaton accepts its input if and only if the computation 
tree has a path from the root to a leaf labelled by an accepting configuration. 
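Remark 5.1 can be illustrated by a small search over configurations. The following Python sketch is not part of the original notes. It assumes a hypothetical encoding in which a configuration is a triple (stack, state, remaining input), the string ">" plays the role of the initial stack symbol, the empty string stands for an ε-move, and the end-of-input symbol is left implicit. The example automaton for the language of words 0^n 1^n with n ≥ 1 and the name `pda_accepts` are assumptions made for illustration only.

```python
from collections import deque

# Hypothetical automaton for {0^n 1^n : n >= 1}.
# DELTA maps (state, input symbol or "", top of stack) to a set of
# (new state, symbols replacing the top of the stack) pairs.
DELTA = {
    ("q0", "0", ">"): {("q0", (">", "Z"))},   # push Z above the bottom symbol
    ("q0", "0", "Z"): {("q0", ("Z", "Z"))},   # push another Z
    ("q0", "1", "Z"): {("q1", ())},           # pop Z for the first 1
    ("q1", "1", "Z"): {("q1", ())},           # pop Z for each further 1
    ("q1", "", ">"): {("qA", (">",))},        # only the bottom symbol is left
}
ACCEPTING = {"qA"}


def pda_accepts(word, max_configs=10_000):
    """Breadth-first search of the computation tree for an accepting node."""
    start = ((">",), "q0", word)               # (stack, state, remaining input)
    seen, queue = {start}, deque([start])
    while queue and len(seen) <= max_configs:  # bound guards against epsilon loops
        stack, state, rest = queue.popleft()
        if state in ACCEPTING and rest == "":
            return True
        if not stack:
            continue
        top, below = stack[-1], stack[:-1]
        children = []
        if rest:                               # transitions that read a symbol
            for p, push in DELTA.get((state, rest[0], top), ()):
                children.append((below + push, p, rest[1:]))
        for p, push in DELTA.get((state, "", top), ()):   # epsilon transitions
            children.append((below + push, p, rest))
        for cfg in children:
            if cfg not in seen:
                seen.add(cfg)
                queue.append(cfg)
    return False
```

The search returns as soon as one branch reaches an accepting state with the whole input read, which corresponds to finding a path from the root to an accepting node of the computation tree.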


Based on the above discussion we give now formal definitions of some concepts. 


Definition 5.6. The language accepted by a pushdown automaton is the set of words $w \in \Sigma^*$ accepted by the pushdown automaton.


5.2 Deterministic pushdown automata 


A pushdown automaton is a deterministic one if there is no more than one possible transition in any configuration. This condition requires that, for any configuration of the automaton, the transition function yields at most one possible transition. However, for pushdown automata there is another source of nondeterminism: for a given state and a given stack symbol there might be a choice between transitions for given input symbols and a transition that does not involve an input symbol (an ε-transition). As a result, the following definition formulates conditions for a pushdown automaton to be a deterministic one.


Definition 5.7. A pushdown automaton $M = (Q, \Sigma, \Gamma, \delta, q_0, \rhd, F, R)$ is a deterministic one if and only if:

• its transition function yields at most one transition for any arguments, i.e. for any triple $(q, x, X)$, $q \in Q$, $x \in (\Sigma \cup \{\varepsilon, \lhd\})$, $X \in \Gamma$,
• for given $q \in Q$ and $X \in \Gamma$ the transition function rejects for $(q, \varepsilon, X)$ or rejects for all $(q, a, X)$, $a \in (\Sigma \cup \{\lhd\})$.
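The two conditions of the definition can be checked mechanically on a finite transition table. The sketch below is an illustration only: the dictionary encoding of the transition function, the use of the empty string for ε, and the names `is_deterministic`, `DET` and `NONDET` are assumptions that do not come from the notes, and the end-of-input symbol is not modelled.

```python
def is_deterministic(delta, states, input_symbols, stack_symbols):
    """Check both conditions of the determinism definition on a finite table."""
    for q in states:
        for X in stack_symbols:
            eps_moves = delta.get((q, "", X), set())
            if len(eps_moves) > 1:            # first condition, epsilon argument
                return False
            for a in input_symbols:
                moves = delta.get((q, a, X), set())
                if len(moves) > 1:            # first condition, input argument
                    return False
                if eps_moves and moves:       # second condition: no choice between
                    return False              # reading a and moving on epsilon
    return True


# Hypothetical tables: (state, input symbol or "", stack top) -> set of moves.
DET = {
    ("q0", "0", ">"): {("q0", (">", "Z"))},
    ("q0", "0", "Z"): {("q0", ("Z", "Z"))},
    ("q0", "1", "Z"): {("q1", ())},
    ("q1", "1", "Z"): {("q1", ())},
    ("q1", "", ">"): {("qA", (">",))},
}
NONDET = dict(DET)
NONDET[("q0", "", ">")] = {("qA", (">",))}    # competes with ("q0", "0", ">")

STATES, INPUTS, STACK = {"q0", "q1", "qA"}, {"0", "1"}, {">", "Z"}
```

With these tables, `is_deterministic(DET, STATES, INPUTS, STACK)` holds, while the extra ε-move in `NONDET` violates the second condition.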


Proposition 5.1. Deterministic pushdown automata are not equivalent to nondeter- 
ministic ones. 


5.3 Accepting states versus empty stack 


Examples presented in previous sections show that acceptance came with the initial symbol of the stack $\rhd$ and the end-of-input symbol $\lhd$. A question may be asked whether this is a coincidence or rather a rule, i.e. whether we can change acceptance by accepting state to acceptance by empty stack (empty input is the default, because we already assumed that acceptance must be accompanied by empty input). This question can be answered positively. We may introduce a class of pushdown automata which accept when the stack is empty, and prove that automata accepting by empty stack are equivalent to automata accepting by state. The idea of the proof is quite clear. On the one hand, having a pushdown automaton accepting by state, we can empty its stack when an accepting state is reached and then accept its input. On the other hand, a pushdown automaton accepting by empty stack should make an additional transition to an extra accepting state when its stack is empty. Details are given below.


Definition 5.8. A pushdown automaton accepting by empty stack is a system

$M = (Q, \Sigma, \Gamma, \delta, q_0, \rhd, \emptyset, R)$

where the set of accepting states is empty and the other components are as given in Definition 5.1.


Note that acceptance by empty stack does not depend on the state in which the automaton terminates. Because of this the set of accepting states of such an automaton is empty.


Proposition 5.2. Acceptance by states is equivalent to acceptance by empty stack. 


Proof. We prove that for any pushdown automaton accepting by states there exists an equivalent automaton accepting by empty stack, and vice versa.
Assume that there is a pushdown automaton accepting by states

$M = (Q, \Sigma, \Gamma, \delta, q_0, \rhd, F, R)$

The following pushdown automaton accepting by empty stack is equivalent to the above one

$M' = (Q', \Sigma, \Gamma', \delta', q_0', \rhd', \emptyset, R)$


where:

• $Q' = Q \cup \{q_0', q'\}$ and $Q \cap \{q_0', q'\} = \emptyset$,
• $\Gamma' = \Gamma \cup \{\rhd'\}$ and $\Gamma \cap \{\rhd'\} = \emptyset$,
• the transition function $\delta'$ is designed as follows:

- $\delta'(q_0', \varepsilon, \rhd') = \{(q_0, \rhd' \rhd)\}$,
- $\delta'(q, a, X) = \delta(q, a, X)$ for $q \in Q$, $a \in \Sigma$, $X \in \Gamma$,
- $\delta'(q, \varepsilon, X) = \delta(q, \varepsilon, X)$ for $q \in (Q - F)$, $X \in \Gamma$,
- $\delta'(q, \varepsilon, X) = \{(q', \varepsilon)\}$ for $q \in F$, $X \in \Gamma - \{\rhd\}$: start emptying the stack,
- $\delta'(q', \varepsilon, X) = \{(q', \varepsilon)\}$ for $X \in \Gamma - \{\rhd\}$: continue emptying the stack,
- $\delta'(q', \lhd, \rhd) = \{(q', \varepsilon)\}$: remove the end-of-input symbol and the initial stack symbol, accept with empty stack and empty input,
- $M'$ rejects for all other configurations.


The automaton M’ goes to the initial configuration of M in its first transition, c.f. 
the first point. Then, it follows computation of M. When M reaches an accepting 
configuration (recall, that it terminates computation for an accepting state), then M’ 
pops a top symbol from its stack and switches to extra state q’ to empty the stack. 

Now, we design a pushdown automaton accepting by states equivalent to a given one accepting by empty stack. Assume that there is a pushdown automaton accepting by empty stack

$M = (Q, \Sigma, \Gamma, \delta, q_0, \rhd, \emptyset, R)$

The following pushdown automaton accepting by states is equivalent to the above one:

$M' = (Q', \Sigma, \Gamma', \delta', q_0', \rhd', F', R)$

where:

• $Q' = Q \cup \{q_0', q_A\}$ and $Q \cap \{q_0', q_A\} = \emptyset$,
• $\Gamma' = \Gamma \cup \{\rhd'\}$ and $\Gamma \cap \{\rhd'\} = \emptyset$,
• $F' = \{q_A\}$,
• the transition function $\delta'$ is designed as follows:

- $\delta'(q_0', \varepsilon, \rhd') = \{(q_0, \rhd' \rhd)\}$,
- $\delta'(q, a, X) = \delta(q, a, X)$ for $q \in Q$, $a \in \Sigma$, $X \in \Gamma$,
- $\delta'(q, \varepsilon, X) = \delta(q, \varepsilon, X)$ for $q \in Q$, $X \in (\Gamma - \{\rhd\})$,
- $\delta'(q, \varepsilon, \rhd) = \{(q_A, \varepsilon)\}$ for $q \in Q$,
- $M'$ rejects for all other configurations.

The automaton $M'$ goes to the initial configuration of $M$ in its first transition, cf. the first point. Then it follows the computation of $M$. When $M$ empties its stack, $M'$ pops the top symbol from its stack (it is the initial stack symbol of $M$), switches (nondeterministically) to the accepting state $q_A$ and accepts if its input is empty. Otherwise, when the input is not empty, $M'$ continues the simulation of $M$.




5.4 Pushdown automata as Turing machines 


Pushdown automata are restricted Turing machines. Moreover, since pushdown au- 
tomata always terminate their computation, they are restricted Turing machines with 
the stop property. 


Proposition 5.3. There is a Turing machine with the stop property equivalent to a 
given pushdown automaton. 


Proof. Let us design a Turing machine equivalent to a given pushdown automaton accepting by states:

$M = (Q, \Sigma, \Gamma, \delta, q_0, \rhd, F, R)$

We design a 2-tape Turing machine with terminating states, which is equivalent to the automaton $M$:

$M_T = (Q_T, \Sigma, \Gamma_T^1, \Gamma_T^2, \delta_T, B, \{q_A\}, \{q_R\})$


where: 


• $Q_T = (Q - (F \cup R)) \cup \{q_A, q_R\} \cup Q_S$, $Q \cap (\{q_A, q_R\} \cup Q_S) = \emptyset$ - the set of states of the Turing machine $M_T$ includes:

- states of the automaton $M$ except accepting and rejecting states (i.e. except states which terminate the computation of the automaton),
- a new halting accepting state and a new halting rejecting state of the Turing machine $M_T$, and
- additional states, which simulate transitions of the automaton $M$,

• $\Gamma_T^1 = \Sigma \cup \{\lhd, B\}$ - the alphabet of the first tape,
• $\Gamma_T^2 = \Gamma \cup \{\rhd, B\}$ - the alphabet of the second tape,
• $B$ - the blank symbol of both tapes.


A configuration of a pushdown automaton is characterized by the following con- 
figuration of a Turing machine, c.f. Figure 5.2: 


• the input is stored on the first tape, the head of the first tape is placed over the first input symbol,
• the stack is stored on the second tape, the initial stack symbol is the leftmost nonblank symbol of the tape, the top stack symbol is the rightmost nonblank tape symbol, the head of the tape is placed on the rightmost nonblank symbol (the top symbol of the stack),
• the control unit of the Turing machine is in the same state as the control unit of the pushdown automaton.


All values of the transition function $\delta$ of the automaton are enumerated. Namely, if the transition function $\delta$ yields k transitions for a given triple of arguments, $\delta(q, a, X) = \{(p_1, \alpha_1), (p_2, \alpha_2), \dots, (p_k, \alpha_k)\}$, then every pair $(p_i, \alpha_i)$ gets its own number.


[Figure: a pushdown automaton with the stack $\rhd X_1 X_2 \dots X_m$, the input tape $a_1 a_2 \dots a_n \lhd$, the heads and the control unit in a state $q$, shown together with a two-tape Turing machine whose tape no. 1 holds the input, whose tape no. 2 holds the stack, and whose control unit is in the state $q$.]

Fig. 5.2 A pushdown automaton and its characterization by a Turing machine.


A transition of the automaton $(p, \alpha) \in \delta(q, a, X)$ with a given number $j$ and $p \in Q - (F \cup R)$, $a \in \Sigma \cup \{\lhd\}$ and $X \in \Gamma$ is simulated by the Turing machine as follows:


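• if $\alpha$ is the empty sequence, then the Turing machine, being in the state $q$ and reading $a$ on the first tape and $X$ on the second tape, replaces $X$ with the blank symbol $B$, moves the head of the second tape to the new top symbol of the stack, shifts the head of the first tape right and switches to the state $p$,

• if $\alpha = X_1 X_2 \dots X_r$, then the set of states $Q_T$ is expanded by adding states $p_1^j, p_2^j, \dots, p_r^j$, and transitions are included that replace $X$ with $X_1$, write the symbols $X_2, \dots, X_r$ in the successive cells of the second tape, shift the head of the first tape right and finally switch to the state $p$.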


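A simulation of a transition $(p, \alpha) \in \delta(q, \varepsilon, X)$ is similar to the simulation of a transition $(p, \alpha) \in \delta(q, a, X)$, with the following change in both cases (the empty sequence $\alpha$ and $\alpha = X_1 X_2 \dots X_r$): the corresponding transitions of the Turing machine are included for every symbol of the first tape, and the head of the first tape does not move.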


In the transitions shown above, if $p \in F$, then it should be replaced with $q_A$, and if $p \in R$, then it should be replaced with $q_R$.




The above construction of a Turing machine simulating a pushdown automaton 
shows that the Turing machine has the stop property and it accepts an input if and 
only if the pushdown automaton accepts the same input. 


Chapter 6 
Finite automata 


Finite automata are the simplest model of computation. They can be derived from pushdown automata by removing the stack. Finite automata are finite structures without any potentially infinite element such as, for instance, a stack in pushdown automata or a tape in Turing machines. Despite these limitations, finite automata are important theoretical and practical tools. Three classes of finite automata are distinguished: deterministic, nondeterministic and those with ε-transitions.


6.1 Deterministic finite automata 


The deterministic model of finite automata is the simplest of the three types: deterministic, nondeterministic and with ε-transitions. It is the simplest one in terms of the description of automata as well as of the computation realized for a given input word. From now on the term finite automata will denote deterministic finite automata. Any reference to nondeterministic finite automata or finite automata with ε-transitions will be explicitly acknowledged.


Definition 6.1. A deterministic finite automaton is a system

$M = (Q, \Sigma, \delta, q_0, F)$

with the following components:

• $Q$ - a finite set of states,
• $\Sigma$ - a finite input alphabet,
• $q_0$ - the initial state, $q_0 \in Q$,
• $F$ - a set of accepting states, $F \subseteq Q$,
• $\delta$ - a transition function, $\delta : Q \times \Sigma \to Q$.

The transition function of a finite automaton is a total function, i.e. it is defined for all pairs of arguments (a transition function of a Turing machine or of a pushdown automaton could be undefined for some arguments).




[Figure: an input tape holding the input word $a_1 a_2 \dots a_n$, the input head, and the control unit.]

Fig. 6.1 A deterministic finite automaton.


A finite automaton could be interpreted as a physical mechanism shown in Figure 6.1. This mechanism consists of:

• a control unit, which is in a state $q \in Q$,
• an input tape holding input data $a_1 a_2 \dots a_n$,
• an input head, which reads an input symbol and shifts right.


As in the case of pushdown automata, a finite automaton is aimed at accepting a language, i.e., answering whether its input word belongs to the language or not. Computation of a given finite automaton is done according to the following intuitive procedure:

1. the initial configuration of a given finite automaton is described as follows:

   a. the input data, a word $w = a_1 a_2 \dots a_n$ over the input alphabet $\Sigma$, is stored on the input tape, cf. Figure 6.1,
   b. the head of the input tape is placed over the first (leftmost) symbol of the input word,
   c. the control unit is in the initial state $q_0$,

2. based on:

   a. a state $q$ of the control unit,
   b. an input symbol $a$,

   the automaton realizes the following actions:

   c. a state $p = \delta(q, a)$ is computed by the transition function, where $a \in \Sigma$ is the input symbol and $q, p \in Q$ are states,
   d. the input head is shifted right,
   e. the control unit switches to the state $p$,

3. if the input is not empty, then computation goes to the point 2, otherwise computation is terminated and the automaton responds with the state of the control unit.


Of course, computation of a finite automaton always terminates.

Note that finite automata, like pushdown automata, cannot store output information, so they can only accept languages and cannot compute functions or solve problems.
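The intuitive procedure above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not part of the original notes: the transition function is encoded as a dictionary, and the example automaton (which tracks the parity of the number of 1s) together with the names `run_dfa`, `DELTA` and `ACCEPTING` is hypothetical.

```python
# Hypothetical deterministic finite automaton over {0, 1}:
# the state records whether an even or an odd number of 1s has been read.
DELTA = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}
ACCEPTING = {"even"}


def run_dfa(delta, q0, word):
    """Follow points 1-3 of the procedure and record each step description."""
    state = q0                                # point 1c: start in the initial state
    trace = [(state, word)]                   # (state, remaining input)
    for i, a in enumerate(word):              # point 3: repeat while input remains
        state = delta[(state, a)]             # points 2c and 2e: compute p, switch to it
        trace.append((state, word[i + 1:]))   # point 2d: the head has moved right
    return state, trace


final_state, steps = run_dfa(DELTA, "even", "1011")
accepted = final_state in ACCEPTING
```

Because the transition function is total, the dictionary lookup never fails for words over the input alphabet, and the loop performs exactly one transition per input symbol.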




Now we define a step description, a transition relation and a computation of finite 
automata. 


Definition 6.2. A step description (a configuration) of a finite automaton $M = (Q, \Sigma, \delta, q_0, F)$ is the following sequence of symbols:

$q \; \omega$

where:

• $q \in Q$ is the current state of the control unit of the finite automaton,
• $\omega$ is the current input, the first symbol of $\omega$ is the symbol under the input head.


Note that the sequence of symbols $\omega$ is a word over the input alphabet $\Sigma$. It is empty after termination of a computation. It can never be infinite.

For instance, the initial step description (configuration) is $q_0 \; a_1 a_2 \dots a_n$, where $q_0$ is the initial state and $w = a_1 a_2 \dots a_n$ is the input data. On the other hand, a step description $q$ informs that the control unit of the finite automaton is in the state $q$ and all input symbols have already been read.

As in the case of Turing machines and pushdown automata, we will use the symbol $\succ$ to denote a transition of a finite automaton. The transition symbol $\succ$ may be supplemented with a finite automaton name, $\succ_M$, to emphasize that a transition concerns a given finite automaton. It can also be supplemented with a superscript, $\succ^k$, to indicate that k transitions have been made.

If a step description is $q \; a_i a_{i+1} \dots a_n$ and the value of the transition function applicable to this step description is $\delta(q, a_i) = p$, then the next step description after the transition is made is $p \; a_{i+1} \dots a_n$. These two step descriptions create a transition of the finite automaton, which is denoted as:

$q \; a_i a_{i+1} \dots a_n \;\succ\; p \; a_{i+1} \dots a_n$


Definition 6.3. Transitions of a finite automaton create a binary relation in the space of all possible configurations of the automaton, i.e. any two configurations are related if and only if the second is derived from the first one by application of the transition function. This relation is called the transition relation of a given finite automaton. The transitive closure of the transition relation is denoted by $\succ^*$.


Definition 6.4. A computation of a finite automaton $M = (Q, \Sigma, \delta, q_0, F)$ is a sequence of configurations $\eta_1, \eta_2, \dots, \eta_n$ such that $\eta_1$ is the initial configuration, $\eta_n$ is the final configuration and any pair of two successive configurations belongs to the transition relation. A computation is denoted as $\eta_1 \succ \eta_2 \succ \dots \succ \eta_n$.


Remark 6.1. A computation of a finite automaton is a finite sequence of configurations

$q_0 \; a_1 a_2 a_3 \dots a_n \;\succ\; q_{i_1} \; a_2 a_3 \dots a_n \;\succ\; q_{i_2} \; a_3 \dots a_n \;\succ\; \dots \;\succ\; q_{i_{n-1}} \; a_n \;\succ\; q_{i_n}$

Because any transition consists of reading an input symbol and switching to a state, a computation will be shown in a simpler form

$q_0 \; a_1 \; q_{i_1} \; a_2 \; q_{i_2} \; a_3 \; \dots \; q_{i_{n-1}} \; a_n \; q_{i_n}$

Now we give a formal definition of acceptance of an input by a finite automaton.


Definition 6.5. A finite automaton accepts its input if and only if the pair of the initial configuration and a final configuration belongs to the transitive closure of the transition relation, i.e. $\eta_I \succ^* \eta_F$, where $\eta_I$ is the initial configuration and $\eta_F$ is a final configuration.


Based on the above discussion we give now formal definitions of some concepts. 


Definition 6.6. The language accepted by a finite automaton $M = (Q, \Sigma, \delta, q_0, F)$ is the set of words $w \in \Sigma^*$ accepted by the finite automaton.


The transition relation identified in Definition 6.3 yields exactly one state for a given configuration of a finite automaton. Namely, for a given state $q$ and a given symbol $a$ of the input alphabet there is exactly one state $p$ related to the given state $q$. The state $p$ is yielded by the transition function: $p = \delta(q, a)$. This property comes from the definition of the transition function, which is total and whose value is a state: $\delta : Q \times \Sigma \to Q$. In other words, for a given state, exactly one state is related to it with regard to a given symbol of the input alphabet. The transitive closure of the transition relation is a function as well, i.e. for a given state, exactly one state is related to it with regard to a given sequence of symbols of the input alphabet. This property is exploited in the definition of the so-called closure of a transition function.


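Definition 6.7. The closure of a transition function $\delta$ of a given deterministic finite automaton $M = (Q, \Sigma, \delta, q_0, F)$ is the function $\hat{\delta}$ satisfying the following conditions:

1. $\hat{\delta} : Q \times \Sigma^* \to Q$,
2. $(\forall q \in Q) \; \hat{\delta}(q, \varepsilon) = q$,
3. $(\forall q \in Q)(\forall a \in \Sigma)(\forall w \in \Sigma^*) \; \hat{\delta}(q, wa) = \delta(\hat{\delta}(q, w), a)$.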


The closure of a transition function of a deterministic finite automaton extends the domain of the transition function from the input alphabet of the automaton to the set of all words over the input alphabet. The closure of a transition function applied to the initial configuration of a finite automaton immediately yields the result of computation $\hat{\delta}(q_0, w)$ of this automaton for the initial state $q_0$ and an input word $w = a_1 a_2 \dots a_n$.


Remark 6.2. The restriction of the closure $\hat{\delta}$ of a transition function $\delta$ to the domain of $\delta$ is equal to $\delta$:

$\hat{\delta}|_{Q \times \Sigma} = \delta$

For that reason both a transition function of a deterministic finite automaton and its closure are denoted by the same symbol $\delta$ unless it might result in some misinterpretation.
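The recursive definition of the closure translates almost literally into code. The sketch below is an illustration only; it reuses a hypothetical parity automaton, and the name `delta_hat` and the dictionary encoding of the transition function are assumptions rather than notation from the notes.

```python
# Hypothetical deterministic finite automaton over {0, 1} (parity of 1s).
DELTA = {
    ("even", "0"): "even", ("even", "1"): "odd",
    ("odd", "0"): "odd",   ("odd", "1"): "even",
}


def delta_hat(delta, q, w):
    """Closure of the transition function, following its recursive definition."""
    if w == "":                               # closure(q, empty word) = q
        return q
    prefix, last = w[:-1], w[-1]              # w = (prefix)(last symbol)
    return delta[(delta_hat(delta, q, prefix), last)]
```

For a one-symbol word the function returns `DELTA[(q, a)]`, which is the restriction property stated in Remark 6.2.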




6.2 Nondeterministic finite automata 


In the previous section we studied deterministic finite automata. In this section we introduce and discuss nondeterministic finite automata. Nondeterministic finite automata differ from their deterministic counterparts in the form of the transition function. The transition function of a nondeterministic finite automaton allows for choosing a transition among several states. This new feature simplifies solutions of problems, though it does not increase the computational abilities of finite automata. Below, we will prove that both classes of automata, i.e. deterministic finite automata and nondeterministic finite automata, are equivalent with regard to the accepted languages. This means that each of these two classes of finite automata accepts the same class of languages. In other words, for a given automaton of one class we can construct an equivalent automaton of the other class. The proof of equivalence of both classes of automata is a constructive one, which means that it formulates a method of construction of a deterministic finite automaton equivalent to a given nondeterministic one. Of course, the trivial opposite construction is also mentioned.


Definition 6.8. A nondeterministic finite automaton is a system

$M = (Q, \Sigma, \delta, q_0, F)$

where:

• $\delta$ - a transition function, $\delta : Q \times \Sigma \to 2^Q$,
• $Q$, $\Sigma$, $q_0$, $F$ are the same as given in Definition 6.1.


A transition function of nondeterministic finite automata is a total function, the 
same as for deterministic finite automata, i.e., it is defined for all pairs of arguments. 
The values of a transition function are subsets of a set of states including the empty 
set (which is a subset of the set of all states). 

An interpretation of a nondeterministic finite automaton is as shown in Figure 6.1.

A definition of a configuration (step description) of a nondeterministic finite au- 
tomaton is identical with the definition of a configuration of deterministic finite 
automata, c.f. Definition 6.2. 


Remark 6.3. As in the case of deterministic finite automata, nondeterministic finite automata are aimed at accepting a language. The computation of a given nondeterministic finite automaton is done according to the following intuitive procedure:

1. the initial configuration of a given finite automaton is described as follows:

   a. the input data, a word $w = a_1 a_2 \dots a_n$ over the input alphabet $\Sigma$, is stored on the input tape, cf. Figure 6.1,
   b. the head of the input tape is placed over the first (leftmost) symbol of the input word,
   c. the control unit is in the initial state $q_0$,

2. based on:

   a. a state $q$ of the control unit,
   b. an input symbol $a$,

   the automaton realizes the following actions:

   c. a set of states $\{p_1, p_2, \dots, p_k\} = \delta(q, a)$ is computed by the transition function, where $a \in \Sigma$ is the input symbol and $q, p_1, p_2, \dots, p_k \in Q$ are states,
   d. if the set yielded by the transition function is empty, then computation is terminated and the input is rejected, otherwise
   e. a state $p_i$ is nondeterministically picked from the set of states $\{p_1, p_2, \dots, p_k\}$,
   f. the input head is shifted right,
   g. the control unit switches to the state $p_i$,

3. if the input is not empty, then computation goes to the point 2, otherwise computation is terminated and the automaton responds with the state of the control unit.


Of course, computation of a nondeterministic finite automaton always terminates, as we have seen for deterministic finite automata.

The symbols $\succ$, $\succ_M$ and $\succ^k$ are used in the usual way, cf. our discussion on deterministic finite automata.


Definition 6.9. Transitions of a nondeterministic finite automaton form a binary relation in the space of all possible configurations of the automaton, i.e. any two configurations are related if and only if the second is derived from the first one by a transition in terms of point 2 of Remark 6.3. This relation is called the transition relation of a given nondeterministic finite automaton. The transitive closure of the transition relation is denoted by $\succ^*$.


Definition 6.10. A computation of a nondeterministic finite automaton

$M = (Q, \Sigma, \delta, q_0, F)$

is a tree such that:

• its nodes are labelled by configurations of the automaton,
• its root is labelled by the initial configuration,
• for any node $\eta_p$, every child $\eta_c$ of it is related to it in the transition relation, i.e. $\eta_p \succ \eta_c$.


Remark 6.4. Note that a computation tree of a nondeterministic finite automaton is 
a k-tree, where k is the maximal number of values yielded by the transition function 
for given arguments. Moreover, edges of every level of such a tree are labelled with 
the same input symbol. 




Remark 6.5. Formal definitions of acceptance of an input and of a language accepted by a nondeterministic finite automaton are identical with the respective definitions for deterministic finite automata, cf. Definition 6.5 and Definition 6.6.


Remark 6.6. A nondeterministic finite automaton accepts its input if and only if the 
computation tree has a path from the root to a leaf labelled by an accepting configu- 
ration. 


An algorithm realized by the automaton:

• the automaton stays in the state $q_0$ reading input symbols until a sequence of three successive 0s or three successive 1s arrives,
• when the beginning of a sequence of three successive 0s or 1s is nondeterministically encountered, a transition is made to the state $q_{10}$ or $q_{11}$, respectively,
• the next two consecutive 0s or 1s are counted by transitions to states $q_{20}$ and $q_A$ or to states $q_{21}$ and $q_A$,
• when the state $q_A$ has been reached, computation stays in this state for the next incoming input symbols,
• input words for which the accepting state $q_A$ is reached are accepted, and only such words.
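The algorithm described above can be written down as a transition table and explored branch by branch. The table below is a reconstruction from the verbal description only (the transition diagram itself is not reproduced here), so the exact transitions, the dictionary encoding and the name `has_accepting_path` should be read as assumptions. The function looks for a path in the computation tree from the root to a leaf labelled by an accepting configuration.

```python
# Assumed transition table for "the word contains 000 or 111".
# Missing entries stand for the empty set of successor states.
DELTA = {
    ("q0", "0"): {"q0", "q10"},   # stay, or guess that a run of 0s starts here
    ("q0", "1"): {"q0", "q11"},   # stay, or guess that a run of 1s starts here
    ("q10", "0"): {"q20"},        # second 0 of the run
    ("q20", "0"): {"qA"},         # third 0 of the run
    ("q11", "1"): {"q21"},        # second 1 of the run
    ("q21", "1"): {"qA"},         # third 1 of the run
    ("qA", "0"): {"qA"},          # remain in the accepting state
    ("qA", "1"): {"qA"},
}
ACCEPTING = {"qA"}


def has_accepting_path(state, word):
    """Depth-first search of the computation tree rooted at (state, word)."""
    if word == "":
        return state in ACCEPTING             # leaf: all input has been read
    successors = DELTA.get((state, word[0]), set())
    return any(has_accepting_path(p, word[1:]) for p in successors)
```

The recursion tries every nondeterministic choice, so its running time may grow exponentially with the length of the word; it is meant to mirror the definition of the computation tree, not to be efficient.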


The transition relation of a nondeterministic automaton yields a subset of the set of states for a given configuration (unlike the transition relation of a deterministic automaton, which yields exactly one state). That is, for a given state $q$ and for a given symbol $a$ of the input alphabet, states $p \in P \subseteq Q$ are related to the given state $q$. This property comes from the definition of the transition function, which is total and whose values are subsets of the set of states: $\delta : Q \times \Sigma \to 2^Q$. The transitive closure of the transition relation likewise relates subsets of the set $Q$ to a given state and a given sequence of symbols of the input alphabet. The following definition provides details of the so-called closure of a transition function for nondeterministic automata.


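Definition 6.11. The closure of a transition function $\delta$ of a given nondeterministic finite automaton $M = (Q, \Sigma, \delta, q_0, F)$ is the function $\hat{\delta}$ satisfying the following conditions:

1. $\hat{\delta} : Q \times \Sigma^* \to 2^Q$,
2. $(\forall q \in Q) \; \hat{\delta}(q, \varepsilon) = \{q\}$,
3. $(\forall q \in Q)(\forall a \in \Sigma)(\forall w \in \Sigma^*) \; \hat{\delta}(q, wa) = \delta(\hat{\delta}(q, w), a)$,

where the notation $\delta(P, a) = \bigcup_{p \in P} \delta(p, a)$ for any $P \subseteq Q$ is used in $\delta(\hat{\delta}(q, w), a)$.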


As in the case of deterministic finite automata, the closure of a transition function of a nondeterministic finite automaton extends the domain of the transition function from the input alphabet of the automaton to the set of all words over the input alphabet. The closure of a transition function applied to the initial configuration of a finite automaton immediately yields the result of computation $\hat{\delta}(q_0, w)$ of this automaton for the initial state $q_0$ and an input word $w = a_1 a_2 \dots a_n$.




Remark 6.7. For nondeterministic finite automata, the restriction of the closure $\hat{\delta}$ of a transition function $\delta$ to the domain $Q \times \Sigma$ is equal to the transition function $\delta$:

$\hat{\delta}|_{Q \times \Sigma} = \delta$

For that reason both a transition function of a nondeterministic finite automaton and its closure are denoted by the same symbol $\delta$ unless it may cause misinterpretation.


Remark 6.8. An input word $w$ is accepted by a nondeterministic finite automaton $M = (Q, \Sigma, \delta, q_0, F)$ if and only if $\delta(q_0, w) \cap F \neq \emptyset$.
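The closure of the transition function and the acceptance condition of Remark 6.8 can be sketched as follows. The code unfolds the recursive definition from left to right, one input symbol at a time, which gives the same set of states. The transition table (an automaton detecting three successive 0s or 1s) and the names `delta_hat` and `accepts` are assumptions made for this illustration.

```python
# Assumed nondeterministic automaton for "the word contains 000 or 111";
# missing entries stand for the empty set.
DELTA = {
    ("q0", "0"): {"q0", "q10"}, ("q0", "1"): {"q0", "q11"},
    ("q10", "0"): {"q20"},      ("q20", "0"): {"qA"},
    ("q11", "1"): {"q21"},      ("q21", "1"): {"qA"},
    ("qA", "0"): {"qA"},        ("qA", "1"): {"qA"},
}
ACCEPTING = {"qA"}


def delta_hat(delta, q, w):
    """Closure of the transition function: the set of states reachable on w."""
    current = {q}                             # closure(q, empty word) = {q}
    for a in w:
        # delta(P, a) is the union of delta(p, a) over all p in P
        current = set().union(*(delta.get((p, a), set()) for p in current))
    return current


def accepts(w):
    """Acceptance test: closure(q0, w) has a state in common with F."""
    return bool(delta_hat(DELTA, "q0", w) & ACCEPTING)
```

Unlike a search through the computation tree, this computation keeps only one set of states per level of the tree, so its cost grows linearly with the length of the word.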


A question arises whether nondeterministic finite automata are equivalent to deterministic ones. The answer is positive.

On the one hand, a deterministic automaton is a nondeterministic one which does not exploit nondeterminism. Formally, a deterministic automaton can be turned into a nondeterministic one by replacing every value of the transition function of the deterministic finite automaton, which is a state, by the set including this state.

Conversely, we will prove that, for a given nondeterministic automaton, there exists an equivalent deterministic one, i.e. one accepting the same language. The idea of the construction of such a deterministic automaton is based on observation of computations of automata of both types. An attempt is made to turn a computation of a nondeterministic automaton, which is a tree of configurations, into a sequence of configurations. States of every level of a computation tree are collected into a set. In this way, a tree is turned into a sequence of alternating sets of states and symbols of the input alphabet. Such a sequence corresponds to a computation of a deterministic finite automaton, supposing that sets of states represent potential states of a deterministic finite automaton.

The details behind these intuitive observations are outlined as follows: 


Proposition 6.1. Nondeterministic finite automata are equivalent to deterministic 
ones. 


Proof. First, the following deterministic automaton:

$M = (Q, \Sigma, \delta, q_0, F)$

is equivalent to the nondeterministic one:

$M' = (Q, \Sigma, \delta', q_0, F)$

such that $\delta' : Q \times \Sigma \to 2^Q$, $\delta'(q, a) = \{\delta(q, a)\}$ for $q \in Q$ and $a \in \Sigma$.

Second, let us assume now that a nondeterministic finite automaton is:

$M = (Q, \Sigma, \delta, q_0, F)$

An equivalent deterministic automaton is denoted as follows:




$M' = (Q', \Sigma, \delta', q_0', F')$


where: 


• $Q' \cong 2^Q$ - states of the deterministic automaton correspond to sets of states of the nondeterministic one. A state of $Q'$ corresponding to a set of states $\{q_{i_1}, q_{i_2}, \dots, q_{i_j}\}$ will be denoted $[q_{i_1}, q_{i_2}, \dots, q_{i_j}]$, just to distinguish sets of states of $Q$ from states of $Q'$,

• $q_0' = [q_0]$ - the initial state of $M'$ corresponds to the set including the initial state of $M$,

• $F' = \{ [q_{i_1}, q_{i_2}, \dots, q_{i_j}] \in Q' : \{q_{i_1}, q_{i_2}, \dots, q_{i_j}\} \cap F \neq \emptyset \}$ - accepting states of $M'$ are labelled by sets of states including an accepting state or states of $M$,

• $\delta'([q_{i_1}, q_{i_2}, \dots, q_{i_j}], a) = [p_{i_1}, p_{i_2}, \dots, p_{i_l}] \iff \bigcup_{k=1}^{j} \delta(q_{i_k}, a) = \{p_{i_1}, p_{i_2}, \dots, p_{i_l}\}$ - the transition function of $M'$ takes the union of the values of $\delta$ (which are sets of states of $M$) for the states included in its argument (which corresponds to a set of states of $M$).


A formal proof of equivalence of $M$ and $M'$ is based on mathematical induction carried out with regard to the length of the input word of both automata.

Let us prove that the closures of the transition functions of both automata satisfy the following equivalence for any $w \in \Sigma^*$ (note that the same symbol denotes both a transition function and its closure):


$\delta'([q_0], w) = [q_{i_1}, q_{i_2}, \dots, q_{i_j}] \iff \delta(q_0, w) = \{q_{i_1}, q_{i_2}, \dots, q_{i_j}\}$   (*)

1. for the word of length 0, i.e. for the empty word, the equivalence (*): $\delta'([q_0], \varepsilon) = [q_0]$ and $\delta(q_0, \varepsilon) = \{q_0\}$ is derived directly from the definitions of the closures of the transition functions,

2. let us check that the equivalence (*) holds for a word $a_1 a_2 \dots a_n a = wa$. By the inductive assumption this equivalence holds for the word $w = a_1 a_2 \dots a_n \in \Sigma^*$. Let $a \in \Sigma$. Then, based on the inductive assumption, we get:

$\delta'([q_0], wa) = \delta'(\delta'([q_0], w), a) = \delta'([q_{i_1}, q_{i_2}, \dots, q_{i_j}], a)$ and

$\delta(q_0, wa) = \delta(\delta(q_0, w), a) = \delta(\{q_{i_1}, q_{i_2}, \dots, q_{i_j}\}, a)$

for some set of states $\{q_{i_1}, q_{i_2}, \dots, q_{i_j}\} \subseteq Q$.

Now, let us compute $\delta'([q_{i_1}, q_{i_2}, \dots, q_{i_j}], a)$ based on the above definition of $\delta'$:

$\delta'([q_{i_1}, q_{i_2}, \dots, q_{i_j}], a) = [p_{i_1}, p_{i_2}, \dots, p_{i_l}]$, where $\bigcup_{k=1}^{j} \delta(q_{i_k}, a) = \{p_{i_1}, p_{i_2}, \dots, p_{i_l}\}$

On the other hand

$\delta(\{q_{i_1}, q_{i_2}, \dots, q_{i_j}\}, a) = \bigcup_{k=1}^{j} \delta(q_{i_k}, a) = \{p_{i_1}, p_{i_2}, \dots, p_{i_l}\}$

3. utilizing mathematical induction based on 1. and 2. we conclude that the equivalence (*) holds.


Finally, let us notice that an input word is accepted by the nondeterministic finite automaton $M$ if and only if it is accepted by the deterministic finite automaton $M'$. This property comes directly from the equivalence $(*)$, Remark 6.8 and the definition of the set of accepting states $F'$ of the automaton $M'$.




6.3 Finite automata with $\varepsilon$-moves


In previous sections of this Chapter we discussed two classes of finite automata: deterministic finite automata and nondeterministic finite automata. We proved that both classes are equivalent with regard to accepted languages, i.e., for an automaton in one class an equivalent automaton of the other class can be constructed. The equivalence of both classes effectively helps in solving problems, because we can choose an automaton from the class which is more appropriate for the problem to be solved. Finite automata with epsilon-transitions, $\varepsilon$-transitions for short, create a third class of finite automata. $\varepsilon$-transitions are another tool which may significantly help to solve problems. However, $\varepsilon$-transitions do not increase the computational power of finite automata. In further parts of this section the equivalence of finite automata with $\varepsilon$-transitions and nondeterministic automata is proved. The proof is a constructive one, i.e. it develops a method of constructing a nondeterministic finite automaton that is equivalent to a given finite automaton with $\varepsilon$-transitions. A construction of a finite automaton with $\varepsilon$-transitions that is equivalent to a given nondeterministic finite automaton is also shown.


Definition 6.12. A finite automaton with $\varepsilon$-transitions is a system

$$M = (Q, \Sigma, \delta, q_0, F)$$

where:

• $\delta$ is a transition function, $\delta : Q \times (\Sigma \cup \{\varepsilon\}) \to 2^Q$,
• $Q$, $\Sigma$, $q_0$, $F$ stay the same as described in Definition 6.1.


A transition function of a finite automaton with $\varepsilon$-transitions is a total function defined for states and for symbols of the input alphabet and the empty word. The values of a transition function stay the same as for nondeterministic finite automata; i.e., they are subsets of the set of states. Thus, finite automata with $\varepsilon$-transitions are nondeterministic.

The difference between nondeterministic finite automata and finite automata with $\varepsilon$-transitions lies in the ability of the latter to make a transition without checking an input symbol. Let us recall that we have already considered $\varepsilon$-transitions of pushdown automata. When an $\varepsilon$-transition is made, the input symbol is not checked. This means that $\varepsilon$-transitions depend only on the state of the current configuration. From another point of view, a configuration for an $\varepsilon$-transition consists of a state only, with no input symbol.

A definition of a configuration (step description) of a finite automaton with $\varepsilon$-transitions is identical with the definition of a configuration of deterministic finite automata, c.f. Definition 6.2.

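An obvious property $\varepsilon \circ \varepsilon = \varepsilon$ leads to an interesting idempotent operation on states of a given finite automaton with $\varepsilon$-transitions

$$M = (Q, \Sigma, \delta, q_0, F)$$

The operation is called epsilon-closure, $\varepsilon$-closure for short. It is denoted by $\varepsilon\text{-Cl}$ and defined as follows:

$$\varepsilon\text{-Cl}(q) = \{p \in Q : p \in \delta^*(q, \varepsilon)\} \quad \text{for } q \in Q$$

where:

$$\delta^*(q, \varepsilon) = \bigcup_{k=0}^{\infty} \delta^k(q, \varepsilon)$$

and

$$\delta^0(q, \varepsilon) = \{q\}, \qquad \delta^{k+1}(q, \varepsilon) = \delta(\delta^k(q, \varepsilon), \varepsilon)$$

and

$$\delta(P, x) = \bigcup_{p \in P} \delta(p, x) \quad \text{for } P \subseteq Q,\ x \in \Sigma \cup \{\varepsilon\}$$

The $\varepsilon\text{-Cl}$ operation is obviously idempotent, i.e.

$$\varepsilon\text{-Cl}(q) = \varepsilon\text{-Cl}(\varepsilon\text{-Cl}(q)) = \bigcup_{p \in \varepsilon\text{-Cl}(q)} \varepsilon\text{-Cl}(p)$$

Note that $\delta^*(q, \varepsilon) = \bigcup_{k=0}^{\infty} \delta^k(q, \varepsilon) = \bigcup_{k=0}^{r} \delta^k(q, \varepsilon)$, where $r$ is not greater than the number of states of the automaton. In fact, when a transition diagram of the automaton is considered, $r$ can be taken as the length of a longest path without repeated states that starts at $q$ and has all edges labelled with $\varepsilon$.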
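The iterative definition of $\delta^k(q, \varepsilon)$ translates directly into a small fixed-point computation. The following Python sketch is an illustration only; the dictionary encoding of $\delta$, the use of the empty string `""` for $\varepsilon$, and the sample automaton are assumptions made for this example.

```python
def epsilon_closure(q, delta):
    """Return eps-Cl(q); delta maps (state, symbol) -> set, "" stands for eps."""
    closure, frontier = {q}, {q}
    while frontier:
        # One more application of delta(., eps) to the newest states.
        step = set()
        for p in frontier:
            step |= delta.get((p, ""), set())
        frontier = step - closure
        closure |= frontier
    return closure

# Hypothetical automaton fragment with an eps-cycle between q0 and q1.
delta = {("q0", ""): {"q1"}, ("q1", ""): {"q2", "q0"}, ("q2", "a"): {"q3"}}
closure_q0 = epsilon_closure("q0", delta)
```

The loop stops as soon as an iteration adds no new state, which happens after at most as many rounds as there are states; this mirrors the bound $r$ mentioned above.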


Remark 6.9. As in the case of deterministic finite automata, finite automata with $\varepsilon$-transitions are aimed at accepting a language. Computation of a given finite automaton with $\varepsilon$-transitions is done according to the following intuitive procedure:

1. the initial configuration of a given automaton is described as follows:

   a. the input data, a word $w = a_1 a_2 \ldots a_n$ over the input alphabet $\Sigma$, is stored on the input tape,

   b. the head of the input tape is placed over the first (leftmost) symbol of the input word,

   c. the control unit is in the initial state $q_0$,

2. based on a state $q$ of the control unit and on an input symbol $a \in \Sigma$:

   a. $\varepsilon\text{-Cl}(q)$ is computed,

   b. a state $q' \in \varepsilon\text{-Cl}(q)$ is nondeterministically picked,

   c. a set of states $\{p_1, p_2, \ldots, p_k\} = \delta(q', a)$ is computed,

   d. if the set yielded by the transition function is empty then computation is terminated and the input is rejected, otherwise

   e. a state $p' \in \{p_1, p_2, \ldots, p_k\}$ is nondeterministically picked,

   f. $\varepsilon\text{-Cl}(p')$ is computed,

   g. a state $p \in \varepsilon\text{-Cl}(p')$ is nondeterministically picked,

   h. the input head is shifted right,

   i. the control unit switches to the state $p$,

3. if the input is nonempty then computation proceeds to point 2, otherwise computation is terminated and the automaton responds with the state of its control unit.


As for deterministic and nondeterministic finite automata, computation of a finite automaton with $\varepsilon$-transitions always terminates.

The symbol $\succ$ denotes a transition (of a finite automaton with $\varepsilon$-transitions) in terms of point 2. of Remark 6.9. The symbols $\succ_M$ and $\succ^*$ are used in the usual way, c.f. deterministic and nondeterministic finite automata.


Definition 6.13. A binary relation in the space of all possible configurations of an automaton with $\varepsilon$-transitions is created by transitions of the automaton. The symbol $\succ^*$ denotes the transitive closure of the transition relation.


Definition 6.14. A computation of a finite automaton with $\varepsilon$-moves $M = (Q, \Sigma, \delta, q_0, F)$ is a tree such that:

• its nodes are labelled by configurations of the automaton,

• its root is labelled by the initial configuration,

• levels created by nodes are distinguished: the root creates the 0th level, children of the root create the 1st level etc.,

• nodes of an odd level are yielded by $\varepsilon$-closure applied to nodes of the previous level,

• nodes of an even level are yielded by the transition function applied to nodes of the previous level,

• the bottom level has an odd number.


Remark 6.10. Note that, as for nondeterministic finite automata, a computation tree of a finite automaton with $\varepsilon$-transitions is a $k$-tree, where $k$ is the maximal number of values yielded by the transition function for given arguments (input symbols or the empty word). Moreover, edges of every level of such a tree are labelled with the same input symbol or with the empty word.


Remark 6.11. Formal definitions of acceptance of an input and of a language accepted by a finite automaton with $\varepsilon$-transitions are identical with the respective definitions for deterministic and nondeterministic finite automata, c.f. Definition 6.5 and Definition 6.6.


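Definition 6.15. A closure of a transition function $\delta$ of a given finite automaton with $\varepsilon$-transitions $M = (Q, \Sigma, \delta, q_0, F)$ is the function:

$$\hat{\delta} : Q \times \Sigma^* \to 2^Q$$

1. $(\forall q \in Q)\ \hat{\delta}(q, \varepsilon) = \varepsilon\text{-Cl}(q)$,
2. $(\forall q \in Q)(\forall a \in \Sigma)(\forall w \in \Sigma^*)\ \hat{\delta}(q, wa) = \varepsilon\text{-Cl}(\delta(\hat{\delta}(q, w), a))$

where $\delta(P, a) = \bigcup_{p \in P} \delta(p, a)$ and $\varepsilon\text{-Cl}(P) = \bigcup_{p \in P} \varepsilon\text{-Cl}(p)$ are used for $P \subseteq Q$.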


The closure of a transition function of a finite automaton with $\varepsilon$-transitions extends the domain of the transition function from the input alphabet of the automaton to the set of all words over the input alphabet. The closure of a transition function applied to an initial configuration of a finite automaton immediately yields the result of computation $\hat{\delta}(q_0, w)$ of this automaton for the initial state $q_0$ and an input word $w = a_1 a_2 \ldots a_n$. However, the restriction of the closure $\hat{\delta}$ to the domain $Q \times \Sigma$ is not equal to $\delta$. In fact,

$$\hat{\delta}|_{Q \times \Sigma}(q, a) = \varepsilon\text{-Cl}(\delta(\varepsilon\text{-Cl}(q), a))$$

For that reason a transition function of a finite automaton with $\varepsilon$-transitions and its closure are denoted by different symbols, $\delta$ and $\hat{\delta}$ respectively.


Remark 6.12. An input word $w$ is accepted by a finite automaton with $\varepsilon$-transitions $M = (Q, \Sigma, \delta, q_0, F)$ if and only if $\hat{\delta}(q_0, w) \cap F \neq \emptyset$.
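Definition 6.15 and Remark 6.12 can be traced with a short Python sketch. It is a minimal illustration, assuming a dictionary encoding of $\delta$ with `""` standing for $\varepsilon$; the function names and the sample automaton (intended to accept $a^*b^*$) are hypothetical.

```python
def eps_closure(states, delta):
    """eps-Cl extended to a set of states."""
    closure, todo = set(states), list(states)
    while todo:
        p = todo.pop()
        for r in delta.get((p, ""), set()):
            if r not in closure:
                closure.add(r)
                todo.append(r)
    return closure

def delta_hat(q, word, delta):
    current = eps_closure({q}, delta)          # rule 1: the empty word
    for a in word:                              # rule 2: one symbol at a time
        moved = set()
        for p in current:
            moved |= delta.get((p, a), set())
        current = eps_closure(moved, delta)
    return current

def accepts(word, delta, q0, accepting):
    return bool(delta_hat(q0, word, delta) & set(accepting))

# Hypothetical automaton: q0 loops on a, eps-moves to q1, q1 loops on b.
delta = {("q0", "a"): {"q0"}, ("q0", ""): {"q1"}, ("q1", "b"): {"q1"}}
```

In this example `delta_hat("q0", "a", delta)` contains both `q0` and `q1`, while `delta[("q0", "a")]` contains only `q0`, which illustrates why $\hat{\delta}$ restricted to $Q \times \Sigma$ differs from $\delta$.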


A question may be asked whether finite automata with $\varepsilon$-transitions are equivalent to deterministic ones. The answer is positive. Below we present a proof that finite automata with $\varepsilon$-transitions are equivalent to nondeterministic finite automata. Equivalence with deterministic finite automata then follows as a direct conclusion of Proposition 6.2.

It is clear that a nondeterministic finite automaton is a finite automaton with $\varepsilon$-transitions which does not use $\varepsilon$-transitions. Formally, a nondeterministic finite automaton can be turned into a finite automaton with $\varepsilon$-transitions by extending the domain of its transition function to $Q \times (\Sigma \cup \{\varepsilon\})$ and setting the values of all $\varepsilon$-transitions to the empty set.

On the other hand, we will prove that, for a given finite automaton with $\varepsilon$-transitions, there exists an equivalent nondeterministic one, i.e. one accepting the same language. The idea of constructing such a nondeterministic automaton is based on an analysis of the description of a transition given in point 2 of Remark 6.9. It could be concluded that a transition, as described there, corresponds to a transition of a nondeterministic finite automaton. On the other hand, a transition described there is what the closure of a transition function yields for an input symbol. Details are given as follows:


Proposition 6.2. Finite automata with $\varepsilon$-transitions are equivalent to nondeterministic ones.


Proof. First, a nondeterministic finite automaton

$$M = (Q, \Sigma, \delta, q_0, F)$$

is equivalent to the following finite automaton with $\varepsilon$-transitions:

$$M' = (Q, \Sigma, \delta', q_0, F)$$

such that $\delta' : Q \times (\Sigma \cup \{\varepsilon\}) \to 2^Q$, $\delta'(q, a) = \delta(q, a)$, $\delta'(q, \varepsilon) = \emptyset$ for $q \in Q$ and $a \in \Sigma$.

Second, let us assume that a finite automaton with $\varepsilon$-transitions is given:

$$M = (Q, \Sigma, \delta, q_0, F)$$

An equivalent nondeterministic automaton is denoted as follows:

$$M' = (Q, \Sigma, \delta', q_0, F')$$

where:

• $\delta' = \hat{\delta}|_{Q \times \Sigma}$: the transition function of $M'$ is equal to the closure of the transition function of $M$ restricted to single input symbols,

• $F' = F \cup \{q_0\}$ if $\varepsilon\text{-Cl}(q_0) \cap F \neq \emptyset$, and $F' = F$ otherwise.


A formal proof of equivalence of $M$ and $M'$ is based on mathematical induction with regard to the length of the input word of both automata.

Let us prove that the closures of the transition functions of both automata are equal for any nonempty word $w \in \Sigma^+$:

$$\delta'(q_0, w) = \hat{\delta}(q_0, w) \qquad (*)$$

Note that $\delta'$ denotes both the transition function of $M'$ and its closure. However, the transition function of $M$ and its closure cannot be denoted by the same symbol.

1. for words of length 1 the equality $(*)$ is derived directly from the definitions of both functions,

2. let us check that the equality $(*)$ holds for a word $a_1 a_2 \ldots a_n a = wa$:

$$\delta'(q_0, wa) = \bigcup_{p \in \delta'(q_0, w)} \delta'(p, a) \quad \text{(closure of the transition function } \delta'\text{)}$$

$$= \bigcup_{p \in \hat{\delta}(q_0, w)} \delta'(p, a) \quad \text{(inductive assumption)}$$

$$= \bigcup_{p \in \hat{\delta}(q_0, w)} \hat{\delta}(p, a) \quad \text{(definition of } \delta'\text{)}$$

$$= \hat{\delta}(q_0, wa) \quad \text{(closure of the transition function } \hat{\delta}\text{)}$$

3. utilizing mathematical induction based on 1. and 2. we conclude that the equality $(*)$ holds.


We have just confirmed that both automata $M$ and $M'$ compute the same set of states for a given nonempty input word $w \in \Sigma^+$. In fact, this is only a step towards finalizing the proof, i.e. towards showing that a given input word is either accepted by both automata $M$ and $M'$, or rejected by both of them. Let us consider the following cases:

• transition functions yield the following values for the empty word: $\hat{\delta}(q_0, \varepsilon) = \varepsilon\text{-Cl}(q_0)$ and $\delta'(q_0, \varepsilon) = \{q_0\}$. Thus, $\varepsilon$ is accepted by $M$ if and only if $\hat{\delta}(q_0, \varepsilon) \cap F \neq \emptyset$. On the other hand, $q_0 \in F'$ if and only if $\varepsilon\text{-Cl}(q_0) \cap F \neq \emptyset$. Since $\varepsilon$ is accepted by $M'$ if and only if $q_0 \in F'$, the empty word $\varepsilon$ is either simultaneously accepted by both automata, or simultaneously rejected by them,

• if $\varepsilon\text{-Cl}(q_0) \cap F = \emptyset$ or $q_0 \in F$, then we get that $F' = F$. Simultaneous acceptance is derived from the equality $(*)$ and the definitions of acceptance of an input word by $M$ and $M'$,

• if $\varepsilon\text{-Cl}(q_0) \cap F \neq \emptyset$ and $q_0 \notin F$, then for any nonempty word $w \in \Sigma^+$ we get two subcases:

   – if $q_0 \notin \delta'(q_0, w)$, then either both $\delta'(q_0, w)$ and $\hat{\delta}(q_0, w)$ include an accepting state, or neither does. Again, simultaneous acceptance is derived from the equality $(*)$ and the definitions of acceptance of an input word by $M$ and $M'$,

   – the assumption $q_0 \in \delta'(q_0, w)$ of this subcase implies that the word $w$ is accepted by $M'$. We have $\delta'(q_0, w) = \hat{\delta}(q_0, w)$ due to $(*)$, and $\varepsilon\text{-Cl}(\hat{\delta}(q_0, w)) \cap F \neq \emptyset$, because $q_0 \in \hat{\delta}(q_0, w)$. Let us assume that $w = ua$ for $u \in \Sigma^*$ and $a \in \Sigma$. Then, from Definition 6.15 and idempotency of $\varepsilon\text{-Cl}$ we get:

$$\varepsilon\text{-Cl}(\hat{\delta}(q_0, w)) = \varepsilon\text{-Cl}(\varepsilon\text{-Cl}(\delta(\hat{\delta}(q_0, u), a))) = \varepsilon\text{-Cl}(\delta(\hat{\delta}(q_0, u), a)) = \hat{\delta}(q_0, w)$$

   Finally, we derive that the word $w$ is accepted by $M$, because $\hat{\delta}(q_0, w) \cap F \neq \emptyset$,

• no other case can be found.

We have proved that a word $w \in \Sigma^*$ is either simultaneously accepted by both automata, or simultaneously rejected by them. This completes the proof.
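The construction of $M'$ used in the proof of Proposition 6.2 can be sketched in Python as follows. This is an illustration under assumptions: the dictionary encoding of $\delta$ (with `""` for $\varepsilon$), the helper names and the sample automaton are hypothetical and not taken from the notes.

```python
def eps_closure(states, delta):
    closure, todo = set(states), list(states)
    while todo:
        p = todo.pop()
        for r in delta.get((p, ""), set()):
            if r not in closure:
                closure.add(r)
                todo.append(r)
    return closure

def remove_epsilon(states, sigma, delta, q0, accepting):
    new_delta = {}
    for q in states:
        for a in sigma:
            moved = set()
            for p in eps_closure({q}, delta):
                moved |= delta.get((p, a), set())
            # delta'(q, a) = eps-Cl(delta(eps-Cl(q), a))
            new_delta[(q, a)] = eps_closure(moved, delta)
    new_accepting = set(accepting)
    if eps_closure({q0}, delta) & new_accepting:
        new_accepting.add(q0)                  # F' = F with q0 added
    return new_delta, new_accepting

def nfa_accepts(word, delta, q0, accepting):
    """Plain NFA simulation, with no eps-moves."""
    current = {q0}
    for a in word:
        current = set().union(*(delta.get((p, a), set()) for p in current))
    return bool(current & accepting)

# Hypothetical eps-automaton for a*b*.
delta = {("q0", "a"): {"q0"}, ("q0", ""): {"q1"}, ("q1", "b"): {"q1"}}
d2, f2 = remove_epsilon({"q0", "q1"}, "ab", delta, "q0", {"q1"})
```

Here `q0` becomes accepting in the resulting automaton because $\varepsilon\text{-Cl}(q_0)$ contains the accepting state `q1`, which is exactly the case in which the empty word must be accepted.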


6.4 Finite automata as Turing machines 


In this section evidence is given that finite automata are restricted Turing machines. Furthermore, since finite automata always terminate their computation, they are restricted Turing machines with the stop property. The discussion will be focused on deterministic finite automata. However, since the classes of nondeterministic finite automata and of finite automata with $\varepsilon$-transitions are both equivalent to the class of deterministic ones, the conclusions of this discussion concern all three classes.

A deterministic finite automaton can be simulated by a Turing machine which shifts its head right and terminates computation as soon as the input word has been read. The machine accepts its input if and only if the automaton accepts it. The details of construction of a Turing machine equivalent to a deterministic finite automaton are given below.


Proposition 6.3. There exists a Turing machine with the stop property equivalent to 
a given deterministic finite automaton. 




Proof. Let a deterministic finite automaton be given:

$$M = (Q, \Sigma, \delta, q_0, F)$$

A Turing machine with a two-way infinite tape and with halting states equivalent to this deterministic finite automaton is:

$$M_T = (Q_T, \Sigma, \Gamma_T, \delta_T, q_0, B, \{q_A\}, \{q_R\})$$

where:

• $Q_T = Q \cup \{q_A, q_R\}$: two halting states $q_A$ and $q_R$ are added to the set of states $Q$, $q_A, q_R \notin Q$,
• $\Gamma_T = \Sigma \cup \{B\}$: the blank symbol $B$ and the input alphabet create the tape alphabet, $B \notin \Sigma$,
• the transition function $\delta_T$ is described by the conditions:

$$\delta_T(q, a) = (\delta(q, a), B, R) \quad \text{for } q \in Q,\ a \in \Sigma$$

$$\delta_T(q, B) = \begin{cases} (q_A, B, R) & \text{for } q \in F \\ (q_R, B, R) & \text{for } q \in Q - F \end{cases}$$

The input configuration of the Turing machine $M_T$ is:

• an input word of the deterministic finite automaton $M$ is stored on the tape of $M_T$,
• the head of the tape is placed over the first input symbol,
• the control unit of $M_T$ is in the initial state $q_0$.

Of course, the Turing machine $M_T$ terminates computation as soon as it reaches a halting state. It accepts its input if and only if the deterministic finite automaton $M$ accepts this input.
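The transition function $\delta_T$ above can be generated mechanically from $\delta$. The Python sketch below is only an illustration: the tape is modelled as a dictionary from positions to symbols, the state names `"qA"` and `"qR"` and the sample automaton are assumptions made for this example.

```python
def dfa_to_tm(states, delta, accepting, blank="B"):
    """Build delta_T; "qA" and "qR" are assumed not to be names of DFA states."""
    tm = {}
    for (q, a), target in delta.items():
        tm[(q, a)] = (target, blank, "R")       # read a, write B, move right
    for q in states:
        tm[(q, blank)] = ("qA" if q in accepting else "qR", blank, "R")
    return tm

def run_tm(tm, q0, word, blank="B"):
    tape, head, state = dict(enumerate(word)), 0, q0
    while state not in ("qA", "qR"):            # stop in a halting state
        state, written, move = tm[(state, tape.get(head, blank))]
        tape[head] = written
        head += 1 if move == "R" else -1
    return state == "qA"

# Hypothetical DFA accepting words over {a, b} with an even number of a's.
dfa = {("even", "a"): "odd", ("even", "b"): "even",
       ("odd", "a"): "even", ("odd", "b"): "odd"}
tm = dfa_to_tm({"even", "odd"}, dfa, {"even"})
```

The simulated machine only ever moves right and halts on the first blank, so the number of its steps equals the length of the input word plus one.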

Finite automata are special cases of pushdown automata:


Proposition 6.4. There exists a pushdown automaton equivalent to a given deterministic finite automaton.


Proof. Let a deterministic finite automaton be given:

$$M = (Q, \Sigma, \delta, q_0, F)$$

A deterministic pushdown automaton equivalent to this deterministic finite automaton is:

$$M_S = (Q, \Sigma, \Gamma, \delta_S, q_0, F)$$

where:

• $\Gamma = \{\triangleright\}$: there is only one stack symbol, the initial stack symbol,
• the transition function $\delta_S$ is described by the condition:

$$\delta_S(q, a, \triangleright) = \{(\delta(q, a), \triangleright)\}$$


Chapter 7 
Grammars versus automata 


In Chapter 1 regular expressions and regular languages were discussed. Regular grammars were also introduced. In Chapter 6 finite automata were studied. In this Chapter it is proved that regular grammars generate and finite automata accept the class of regular languages, i.e. the class of languages generated by regular expressions. In other words, it is shown that the concepts of regular expressions, regular grammars and finite automata are equivalent.


7.1 Regular expressions, regular grammars and finite automata 


7.1.1 Regular expressions versus finite automata 


Below we prove that regular expressions are equivalent to finite automata. First, we construct automata equivalent to regular expressions. The proof is based on the inductive definition of regular expressions, c.f. Definition 1.1, and applies mathematical induction. The finite automaton constructed for a given regular expression is an automaton with $\varepsilon$-transitions. Then, a regular expression equivalent to a deterministic finite automaton is constructed. Finally, equivalence of finite automata and regular expressions follows from the above two constructions and from the equivalence of all three classes of finite automata, as shown in Figure 7.1.


Proposition 7.1. Languages generated by regular expressions are accepted by finite automata.


Proof. We use mathematical induction to prove this proposition. The induction is carried out with regard to the length of a regular expression, i.e. the number of symbols in it.

1. there are 3 families of regular expressions of length 1:

   a. the empty regular expression $\emptyset$ generates the empty language $\emptyset$. An equivalent finite automaton is shown in Figure 7.2 a,

   b. the regular expression $\varepsilon$ generates the language $\{\varepsilon\}$ containing only the empty word. An equivalent finite automaton is shown in Figure 7.2 b,

   c. a family of regular expressions $a$, one for each symbol of the input alphabet $a \in \Sigma$, generates languages with a one-letter word $\{a\}$. An equivalent finite automaton is shown in Figure 7.2 c.

   Note that automata shown in Figure 7.2 are nondeterministic finite automata.

2. assume that two regular expressions $s$ and $t$ are given and that these expressions generate languages $S$ and $T$, respectively. Assume that both expressions are given over the same alphabet $\Sigma$ (otherwise, take the union of alphabets of both expressions as a common alphabet). Based on the inductive assumption, take finite automata $M_S$ and $M_T$ shown in Figure 7.3, which are equivalent to regular expressions $s$ and $t$, respectively. Now, we consider the sum $s+t$, the concatenation $s \circ t$ and the Kleene closure $s^*$ of regular expressions. Languages generated by these expressions are $S \cup T$, $S \circ T$ and $S^*$. Finite automata (with $\varepsilon$-transitions) which accept the union and the concatenation of languages $S$ and $T$, and a finite automaton accepting the Kleene closure of the language $S$, are constructed as follows:

   a. a transition diagram of a finite automaton (with $\varepsilon$-transitions) equivalent to the sum $s+t$ of regular expressions $s$ and $t$ is displayed in Figure 7.4,

   b. a transition diagram of a finite automaton (with $\varepsilon$-transitions) equivalent to the concatenation $s \circ t$ of regular expressions $s$ and $t$ is displayed in Figure 7.5,

   c. a transition diagram of a finite automaton (with $\varepsilon$-transitions) equivalent to the Kleene closure $s^*$ of a regular expression $s$ is displayed in Figure 7.6,

3. employing mathematical induction to 1. and 2. we conclude that there exists a finite automaton equivalent to any regular expression.

Fig. 7.1 Equivalence of finite automata and regular expressions, a dependency diagram (nodes: regular expressions, deterministic finite automata, nondeterministic finite automata, finite automata with $\varepsilon$-transitions)


Fig. 7.2 Finite automata equivalent to basic regular expressions

Fig. 7.3 Finite automata $M_S$ and $M_T$ equivalent to given regular expressions $s$ and $t$

Fig. 7.4 A finite automaton equivalent to sum of regular expressions $s$ and $t$

Fig. 7.5 A finite automaton equivalent to concatenation of regular expressions $s$ and $t$


Remark 7.1. Let $M_S = (Q_S, \Sigma, \delta_S, q_0^S, F_S)$ and $M_T = (Q_T, \Sigma, \delta_T, q_0^T, F_T)$. The finite automaton shown in Figure 7.4 is $M = (Q, \Sigma, \delta, q_0, F)$, where:

• $Q = Q_S \cup Q_T \cup \{q_0, q_A\}$, assuming that the sets of states are pairwise disjoint, i.e. $Q_S \cap Q_T = \emptyset$, $Q_S \cap \{q_0, q_A\} = \emptyset$ and $Q_T \cap \{q_0, q_A\} = \emptyset$,

• $F = \{q_A\}$,

• the transition function $\delta$ is based on the transition functions of both automata $M_S$ and $M_T$:

   ◦ $\delta(q, a) = \delta_S(q, a)$ for $q \in Q_S$ and $a \in \Sigma$,

   ◦ $\delta(q, a) = \delta_T(q, a)$ for $q \in Q_T$ and $a \in \Sigma$,

   ◦ $\delta(q, \varepsilon) = \delta_S(q, \varepsilon)$ for $q \in Q_S \setminus F_S$,

   ◦ $\delta(q, \varepsilon) = \delta_T(q, \varepsilon)$ for $q \in Q_T \setminus F_T$,

   ◦ $\delta(q, \varepsilon) = \delta_S(q, \varepsilon) \cup \{q_A\}$ for $q \in F_S$,

   ◦ $\delta(q, \varepsilon) = \delta_T(q, \varepsilon) \cup \{q_A\}$ for $q \in F_T$,

   ◦ $\delta(q_0, \varepsilon) = \{q_0^S, q_0^T\}$,

   ◦ $\delta(q, x) = \emptyset$ for such $(q, x) \in Q \times (\Sigma \cup \{\varepsilon\})$ which are not considered above.
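The union construction of Remark 7.1 can be written down almost literally. The following Python sketch is an illustration under assumptions: automata are encoded as tuples `(states, delta, start, accepting)`, `""` stands for $\varepsilon$, the new states are named `"q0"` and `"qA"`, and the two component automata are hypothetical.

```python
def union(ms, mt):
    """ms, mt: (states, delta, start, accepting) with disjoint state sets."""
    (qs, ds, s0, fs), (qt, dt, t0, ft) = ms, mt
    # Copy all transitions of both components unchanged.
    delta = {key: set(val) for key, val in {**ds, **dt}.items()}
    delta[("q0", "")] = {s0, t0}                  # new initial state
    for q in fs | ft:                             # old accepting states
        delta.setdefault((q, ""), set()).add("qA")
    return qs | qt | {"q0", "qA"}, delta, "q0", {"qA"}

def accepts(m, word):
    _, delta, start, accepting = m
    def close(states):
        todo, seen = list(states), set(states)
        while todo:
            for r in delta.get((todo.pop(), ""), set()):
                if r not in seen:
                    seen.add(r)
                    todo.append(r)
        return seen
    current = close({start})
    for a in word:
        current = close(set().union(*(delta.get((p, a), set()) for p in current)))
    return bool(current & accepting)

# Hypothetical components: M_S accepts {a}, M_T accepts {b}.
m_s = ({"s0", "s1"}, {("s0", "a"): {"s1"}}, "s0", {"s1"})
m_t = ({"t0", "t1"}, {("t0", "b"): {"t1"}}, "t0", {"t1"})
m = union(m_s, m_t)
```

Transitions not listed in the dictionary are treated as yielding the empty set, which corresponds to the last rule of the remark.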


Proposition 7.2. Languages accepted by finite automata are generated by regular expressions.



Fig. 7.6 A finite automaton equivalent to Kleene closure of a regular expression s 


Proof. Let us assume that a deterministic finite automaton

$$M = (\{q_1, q_2, \ldots, q_n\}, \Sigma, \delta, q_1, F)$$

accepts the language $L = L(M)$. The method of constructing a regular expression equivalent to the automaton $M$ relies on the construction of families of languages which are obviously regular and for which regular expressions generating them are easy to obtain.

The families $R^k_{ij}$ of languages are constructed, where $R^k_{ij}$, for given natural numbers $i, j \geq 1$ and $k \geq 0$, denotes the set of such words $w \in \Sigma^*$ that a computation for the word $w$, starting in the state $q_i$, ends in the state $q_j = \delta(q_i, w)$ and does not visit states with indexes greater than $k$. Note that indexes $i$ and $j$ may be greater than $k$. Let us recall that a computation of a deterministic finite automaton is a sequence of alternating states and input symbols. In our case, the computation for the word $w$ begins with the state $q_i$ and ends with the state $q_j$.

A formal description of the family $R^k_{ij}$ of languages is as follows:

$$R^0_{ij} = \begin{cases} \{a \in \Sigma : \delta(q_i, a) = q_j\} & \text{for } i \neq j \\ \{a \in \Sigma : \delta(q_i, a) = q_j\} \cup \{\varepsilon\} & \text{for } i = j \end{cases}$$

$$R^k_{ij} = R^{k-1}_{ik} \circ (R^{k-1}_{kk})^* \circ R^{k-1}_{kj} \cup R^{k-1}_{ij} \quad \text{for } k > 0$$

Languages of the family $R^0_{ij}$ include one-letter words for which there is a transition from a state $q_i$ to a state $q_j$. The empty word is included in the languages $R^0_{ii}$ (according to the rule that the empty word always allows for a transition to the same state). Note that no word longer than one letter can be included in a language of this family, because a computation for such a word includes a middle state with an index greater than 0.

Languages of a family $R^k_{ij}$, for given $i, j, k > 0$, are assembled according to the following rules:

• a computation for a word $w$ which does not visit states with indexes greater than $k$, but visits the state $q_k$, may be decomposed into computations which do not visit $q_k$ in between and which

   ◦ begin in $q_i$ and end in $q_k$; they form the language $R^{k-1}_{ik}$,

   ◦ begin and end in $q_k$; they produce multiple concatenations of the language $R^{k-1}_{kk}$, i.e. they form the language $(R^{k-1}_{kk})^*$,

   ◦ begin in $q_k$ and end in $q_j$; they form the language $R^{k-1}_{kj}$.

   Concatenation of the languages $R^{k-1}_{ik}$, $(R^{k-1}_{kk})^*$ and $R^{k-1}_{kj}$ forms the first part of the union in the second formula,

• a computation for a word $w$ which does not visit states with indexes greater than $k$ (except the beginning state $q_i$ and the ending state $q_j$) may not visit the state $q_k$ at all. In that case $w \in R^{k-1}_{ij}$, which gives the second part of the union in the second formula.


The language $L(M)$ is the set of such words for which the computation begins in $q_1$, ends in some $q_j \in F$ and may visit any state of $M$, i.e.

$$L(M) = \bigcup_{j : q_j \in F} R^n_{1j}$$

Now, existence of regular expressions generating languages of the families $R^k_{ij}$ can be proved employing mathematical induction with regard to $k$:

• languages of the family $R^0_{ij}$ are generated by the following regular expressions:

   ◦ for $i \neq j$,
      - if $R^0_{ij} = \{a_{i_1}, \ldots, a_{i_l}\}$ for $a_{i_1}, \ldots, a_{i_l} \in \Sigma$, then $r^0_{ij} = a_{i_1} + \ldots + a_{i_l}$,
      - if $R^0_{ij} = \emptyset$, then $r^0_{ij} = \emptyset$,
   ◦ for $i = j$,
      - if $R^0_{ij} = \{\varepsilon, a_{i_1}, \ldots, a_{i_l}\}$, $a_{i_1}, \ldots, a_{i_l} \in \Sigma$, then $r^0_{ij} = \varepsilon + a_{i_1} + \ldots + a_{i_l}$,
      - if $R^0_{ij} = \{\varepsilon\}$, then $r^0_{ij} = \varepsilon$,

• based on the inductive hypothesis, assume that languages of the family $R^{k-1}_{ij}$ are generated by regular expressions $r^{k-1}_{ij}$. Notice that the formula for $R^k_{ij}$ is an assembly of languages of the family $R^{k-1}_{ij}$ using union, concatenation and Kleene closure. The assembly operators correspond to operations on regular expressions: sum, concatenation and Kleene closure. These notes lead to the following regular expression generating the language $R^k_{ij}$ for given indexes $i, j, k > 0$:

$$r^k_{ij} = r^{k-1}_{ik} \circ (r^{k-1}_{kk})^* \circ r^{k-1}_{kj} + r^{k-1}_{ij}$$

In conclusion, the following regular expression is equivalent to the automaton $M$, assuming that $F = \{q_{i_1}, q_{i_2}, \ldots, q_{i_p}\}$:

$$r^n_{1 i_1} + r^n_{1 i_2} + \ldots + r^n_{1 i_p}$$
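The inductive formula for $r^k_{ij}$ can be turned into a short program. The Python sketch below is an illustration under assumptions: states are numbered $1, \ldots, n$ with state 1 initial, the result is written in the syntax of Python's `re` module (`|` for sum, an empty alternative for $\varepsilon$), `None` stands for the expression $\emptyset$, and the sample automaton is hypothetical. No simplification of the resulting expression is attempted, so it grows quickly with $n$.

```python
import re
from itertools import product

def dfa_to_regex(n, sigma, delta, accepting):
    """States are 1..n, state 1 is initial; delta maps (i, a) -> j."""
    # r[i][j] holds r^k_{ij}; None stands for the empty-language expression.
    r = [[None] * (n + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            letters = [a for a in sigma if delta[(i, a)] == j]
            if i == j:
                letters.append("")               # the empty word
            r[i][j] = "(?:" + "|".join(letters) + ")" if letters else None
    for k in range(1, n + 1):
        new = [row[:] for row in r]
        for i in range(1, n + 1):
            for j in range(1, n + 1):
                parts = []
                if r[i][k] is not None and r[k][j] is not None:
                    # r^{k-1}_{ik} (r^{k-1}_{kk})* r^{k-1}_{kj}
                    parts.append(r[i][k] + "(?:" + r[k][k] + ")*" + r[k][j])
                if r[i][j] is not None:
                    parts.append(r[i][j])
                new[i][j] = "(?:" + "|".join(parts) + ")" if parts else None
        r = new
    final = [r[1][j] for j in accepting if r[1][j] is not None]
    return "|".join(final) if final else None

# Hypothetical DFA over {a, b}: state 1 = even number of a's (accepting).
delta = {(1, "a"): 2, (1, "b"): 1, (2, "a"): 1, (2, "b"): 2}
pattern = dfa_to_regex(2, "ab", delta, [1])

def dfa_accepts(word):
    state = 1
    for a in word:
        state = delta[(state, a)]
    return state == 1
```

A simple way to gain confidence in such a sketch is to compare `re.fullmatch(pattern, w)` with `dfa_accepts(w)` for all short words `w`, for instance those generated with `itertools.product`.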




7.1.2 Regular grammars versus finite automata 


Now we prove that regular grammars are equivalent to finite automata. First, for a given deterministic finite automaton an equivalent right-linear grammar is constructed. Then, we construct automata equivalent to right-linear grammars. Afterward, it is stated that right-linear grammars are equivalent to left-linear grammars. As a result we come to a graph of equivalence of regular expressions, regular grammars and finite automata. The graph is shown in Figure 7.7.


Fig. 7.7 Equivalence of finite automata, regular expressions and regular grammars, a dependency diagram (nodes: regular expressions, deterministic finite automata, nondeterministic finite automata, finite automata with $\varepsilon$-transitions, right-linear grammars, left-linear grammars)


Theorem 7.1. Languages accepted by finite automata are generated by right-linear grammars.


Proof. Let

$$M = (Q, \Sigma, \delta, q_0, F)$$

be a deterministic finite automaton. A right-linear grammar which generates the language $L = L(M)$ accepted by the automaton $M$ is as follows:

$$G = (V, T, P, S)$$

where:

• $V = Q \cup \{S\}$,
• $T = \Sigma$,
• $P$, the set of productions, includes the following productions:

   ◦ $p \to aq$, if $\delta(p, a) = q$ for $p, q \in Q$ and $a \in \Sigma$,
   ◦ $p \to a$, if $\delta(p, a) \in F$ for $p \in Q$ and $a \in \Sigma$,
   ◦ $S \to q_0$,
   ◦ $S \to \varepsilon$, if $q_0 \in F$.
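The production rules listed above can be generated directly from the transition function. The following Python sketch is an illustration only: productions are encoded as `(head, body)` pairs with the body given as a tuple of symbols, the name `"S"` is assumed not to clash with any state name, state names are assumed to differ from input symbols, and the sample automaton is hypothetical. The helper `derives` searches for a derivation by tracking the pair (number of terminals produced, current nonterminal).

```python
def dfa_to_grammar(delta, q0, accepting):
    """delta maps (state, symbol) -> state; returns a list of productions."""
    productions = [("S", (q0,))]                 # S -> q0
    if q0 in accepting:
        productions.append(("S", ()))            # S -> epsilon
    for (p, a), q in sorted(delta.items()):
        productions.append((p, (a, q)))          # p -> a q
        if q in accepting:
            productions.append((p, (a,)))        # p -> a
    return productions

def derives(productions, word, start="S"):
    """Check whether `word` has a derivation in a right-linear grammar."""
    heads = {h for h, _ in productions}
    todo, seen = [(0, start)], set()
    while todo:
        pos, nt = todo.pop()
        if (pos, nt) in seen:
            continue
        seen.add((pos, nt))
        for head, body in productions:
            if head != nt:
                continue
            terminals = [s for s in body if s not in heads]
            tail = [s for s in body if s in heads]
            end = pos + len(terminals)
            if list(word[pos:end]) != terminals:
                continue
            if tail:
                todo.append((end, tail[0]))      # continue with the nonterminal
            elif end == len(word):
                return True                      # derivation finished
    return False

# Hypothetical DFA accepting words over {a, b} with an even number of a's.
dfa = {("even", "a"): "odd", ("even", "b"): "even",
       ("odd", "a"): "even", ("odd", "b"): "odd"}
grammar = dfa_to_grammar(dfa, "even", {"even"})
```

For instance, the word `aa` is derived by $S \to even \to a\,odd \to aa$, where the last step uses the production $odd \to a$ that exists because $\delta(odd, a) \in F$.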




Notice that a computation of the automaton $M$ for a given word corresponds step by step to a derivation of this word in the grammar $G$. Based on this observation we show that a word is accepted by the automaton $M$ if and only if it is generated by the grammar $G$.

First, we confirm that if a word $w \in L(M)$, then $w \in L(G)$. Let $w = a_1 a_2 \ldots a_n \in \Sigma^*$. Let $q_0 a_1 q_{i_1} a_2 q_{i_2} \ldots q_{i_{n-1}} a_n q_{i_n}$ be the computation of $M$ for the word $w$ such that $q_{i_n} \in F$, i.e. a path in a computation tree from the root to an accepting state or, in other words, $q_0 \succ^* q_{i_n}$ in the transition relation $\succ$. A corresponding derivation in $G$ for any $w \neq \varepsilon$ is:

$$S \to q_0 \to a_1 q_{i_1} \to a_1 a_2 q_{i_2} \to \ldots \to a_1 a_2 \ldots a_{n-1} q_{i_{n-1}} \to a_1 a_2 \ldots a_{n-1} a_n q_{i_n}$$

and then there exists a derivation of $w$:

$$S \to q_0 \to a_1 q_{i_1} \to a_1 a_2 q_{i_2} \to \ldots \to a_1 a_2 \ldots a_{n-1} q_{i_{n-1}} \to a_1 a_2 \ldots a_{n-1} a_n$$

The latter derivation applies the production $q_{i_{n-1}} \to a_n$, which exists because $\delta(q_{i_{n-1}}, a_n) \in F$. If $\varepsilon \in L(M)$, i.e. $q_0 \in F$, then its derivation is immediate: $S \to \varepsilon$.

Now, we verify that if $w \in L(G)$, then $w \in L(M)$. Let $w = a_1 a_2 \ldots a_n$; then there exists a derivation of $w \neq \varepsilon$ in $G$. It is as given above, i.e.

$$S \to q_0 \to a_1 q_{i_1} \to a_1 a_2 q_{i_2} \to \ldots \to a_1 a_2 \ldots a_{n-1} q_{i_{n-1}} \to a_1 a_2 \ldots a_{n-1} a_n$$

Subsequently, a computation of $M$ for $w \neq \varepsilon$ is as given above as well. The computation consisting of the state $q_0$ alone, which corresponds to the derivation $S \to \varepsilon$, leads to acceptance of $w = \varepsilon$.


Theorem 7.2. Languages generated by right-linear grammars are accepted by finite automata.


Proof. Let

$$G = (V, T, P, S)$$

be a right-linear grammar. The following finite automaton with $\varepsilon$-transitions accepts the language $L = L(G)$ generated by the grammar $G$:

$$M = (Q, \Sigma, \delta, q_0, F)$$

where:

• $Q = \{[\alpha] : (\exists \beta \in (V \cup T)^*)\ A \to \beta\alpha \in P\}$, i.e. states are labelled with all possible suffixes of right hand sides of productions,

• $\Sigma = T$,

• $q_0 = [S]$,

• $\delta$, the transition function, is assembled with the rules:
   ◦ if $A \in V$ then $\delta([A], \varepsilon) = \{[\alpha] : A \to \alpha \in P\}$,
   ◦ if $a \in T$ and $\alpha \in (T^* \cup T^* V)$ then $\delta([a\alpha], a) = \{[\alpha]\}$,

• $F = \{[\varepsilon]\}$.
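The automaton defined above can be assembled mechanically from the productions. In the Python sketch below, which is only an illustration, a state $[\alpha]$ is encoded as the tuple of symbols of $\alpha$ (so $[\varepsilon]$ is the empty tuple), `""` stands for $\varepsilon$, and the sample grammar, intended to generate $(ab)^*c$, is hypothetical.

```python
def grammar_to_enfa(productions, nonterminals, start="S"):
    """productions: (head, body) pairs; body is a tuple of the form x or xA."""
    delta = {}
    for head, body in productions:
        # eps-move from [A] to the state labelled by a whole right hand side
        delta.setdefault(((head,), ""), set()).add(body)
        # reading the first terminal of a suffix leads to the shorter suffix
        for i in range(len(body)):
            if body[i] not in nonterminals:
                delta.setdefault((body[i:], body[i]), set()).add(body[i + 1:])
    return delta, (start,), {()}

def accepts(delta, start, accepting, word):
    def close(states):
        seen, todo = set(states), list(states)
        while todo:
            for r in delta.get((todo.pop(), ""), set()):
                if r not in seen:
                    seen.add(r)
                    todo.append(r)
        return seen
    current = close({start})
    for a in word:
        moved = set()
        for p in current:
            moved |= delta.get((p, a), set())
        current = close(moved)
    return bool(current & accepting)

# Hypothetical right-linear grammar: S -> abS | c.
productions = [("S", ("a", "b", "S")), ("S", ("c",))]
delta, q0, final = grammar_to_enfa(productions, {"S"})
```

In this example the states are $[S]$, $[abS]$, $[bS]$, $[c]$ and $[\varepsilon]$, i.e. exactly the suffixes of right hand sides together with the start state.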


Justification of correctness of the automaton construction is based on observation of a derivation in a right-linear grammar.

Any intermediate derivation word is of the form $xA$, where $x \in T^*$, $A \in V$. The right hand side of a production $A \to yB$ employed in a derivation replaces the nonterminal symbol $A$ and creates the next intermediate derivation word $xyB$. The automaton reads the terminal symbols inserted by the production and then switches to the state relevant to the inserted nonterminal symbol. This action is repeated for all productions of this form employed in the derivation. A derivation is terminated with a production $A \to z$, where $z \in T^*$ (without a nonterminal symbol in its right hand side). In this case, the automaton reads the terminal symbols and then goes to the accepting state marked with the empty word $\varepsilon$.

A computation of such an automaton, for a given word, follows a derivation of this word. States visited along a computation correspond to the unread part of the input word. This unread part is represented by leading terminal symbols and by one nonterminal symbol, if the derivation has not yet been terminated. This nonterminal symbol produces the remaining part of the input word. The last production of a derivation does not include a nonterminal, which allows the automaton to read the terminal symbols and to go to the accepting state.

The above notes can be turned into a formal inductive proof with regard to the length of a derivation.


Theorem 7.3. Right-linear grammars are equivalent to left-linear grammars.


7.1.3 The pumping lemma 


The pumping lemma was formulated in Chapter 1, but not proved there. Here it is 
recalled and proved. 


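Theorem 7.4 (the pumping lemma for regular languages). If a language $L$ is regular, then there exists a constant $n_L$ such that for any word $z \in L$ the following condition holds:

$$(|z| \geq n_L) \Rightarrow \left[ \left( \exists_{u,v,w}\ z = uvw \wedge |uv| \leq n_L \wedge |v| \geq 1 \right) \wedge \forall_{i=0,1,2,\ldots}\ z_i = u v^i w \in L \right]$$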


Proof. If a language L is a regular one, then there exists a deterministic finite auto- 

maton M = (Q,2,6,q0,F), which accepts L. Let denote the number of states of this 

automaton |Q| = n. If all words of the language L are shorter than n, then for the 

constant nz = n the implication holds because its antecedent |z| > nz is never true. 
Let z € L, z=aja2...am, where m > n. A computation of M for z is 


EUROPEAN UNION 
EUROPEAN 


SOS EEN? [A] 7.1 Regular expressions, regular grammars and finite automata 89 


9041 Gi, 424 in - - - m—14]mm-1 Em Zim 


and gi, E F. 

There are $m + 1 \geq n + 1$ states in this computation. This means that at least one
state is repeated, because there are only $n$ states in $M$. Let us take the leftmost pair of
repeating states. They, of course, appear in the beginning part of the computation
including no more than $n$ letters and no more than $n + 1$ states. In the following
computation the leftmost repeating states are underscored:

$$q_0 \, a_1 \, q_{i_1} \, a_2 \ldots a_j \, \underline{q_{i_j}} \, a_{j+1} \, q_{i_{j+1}} \ldots a_{j+p} \, \underline{q_{i_{j+p}}} \, a_{j+p+1} \, q_{i_{j+p+1}} \ldots a_{m-1} \, q_{i_{m-1}} \, a_m \, q_{i_m}$$

where $p \geq 1$ and $q_{i_j} = q_{i_{j+p}}$.

For that reason the part of the computation between these repeated states includes at
least one letter. If the first underlined state together with the part between the underlined
(repeated) states, i.e. $q_{i_j} \, a_{j+1} \, q_{i_{j+1}} \ldots a_{j+p}$, is removed, then the remaining
sequence of states and letters is still a computation, which ends in an accepting state:

$$q_0 \, a_1 \, q_{i_1} \, a_2 \ldots a_j \, q_{i_{j+p}} \, a_{j+p+1} \, q_{i_{j+p+1}} \ldots a_{m-1} \, q_{i_{m-1}} \, a_m \, q_{i_m}$$

On the other hand, the sequence $q_{i_j} \, a_{j+1} \, q_{i_{j+1}} \ldots a_{j+p}$ may be inserted just before
the first repeated state and the obtained sequence is still a computation, which ends
in an accepting state:

$$q_0 \, a_1 \, q_{i_1} \ldots a_j \, q_{i_j} \, a_{j+1} \, q_{i_{j+1}} \ldots a_{j+p} \, q_{i_j} \, a_{j+1} \, q_{i_{j+1}} \ldots a_{j+p} \, q_{i_{j+p}} \, a_{j+p+1} \, q_{i_{j+p+1}} \ldots a_{m-1} \, q_{i_{m-1}} \, a_m \, q_{i_m}$$

Insertion of this sequence may be repeated, each time producing a computation which
ends in an accepting state.
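As a result, computations for the following words are created:

$$z_0 = a_1 a_2 \ldots a_j \, a_{j+p+1} \ldots a_{m-1} a_m,$$
$$z_1 = a_1 a_2 \ldots a_j \, a_{j+1} \ldots a_{j+p} \, a_{j+p+1} \ldots a_{m-1} a_m,$$
$$z_2 = a_1 a_2 \ldots a_j \, a_{j+1} \ldots a_{j+p} \, a_{j+1} \ldots a_{j+p} \, a_{j+p+1} \ldots a_{m-1} a_m,$$
$$z_3 = a_1 a_2 \ldots a_j \, a_{j+1} \ldots a_{j+p} \, a_{j+1} \ldots a_{j+p} \, a_{j+1} \ldots a_{j+p} \, a_{j+p+1} \ldots a_{m-1} a_m,$$
etc.

and the above computations end in an accepting state. In terms of the lemma,
$u = a_1 \ldots a_j$, $v = a_{j+1} \ldots a_{j+p}$ and $w = a_{j+p+1} \ldots a_m$. Notice that the part of
a word which is repeated has at least one letter. Moreover, the beginning part together
with the repeated part is not longer than $n_L = n$. Therefore, we have a sequence of
words as required in the consequent of the implication. This proves the lemma.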
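
The decomposition used in the proof can be traced on a concrete automaton. The following is a minimal Python sketch, not part of the original notes: it runs a deterministic automaton on a word, stops at the leftmost repeated state and returns the three parts $u$, $v$, $w$. The function names, the dictionary representation of $\delta$ and the example automaton (words over {a, b} with an even number of letters a) are assumptions made for illustration only.

```python
def pump_decomposition(delta, q0, z):
    """Split z into (u, v, w) at the leftmost repeated state of the run.

    delta: dict mapping (state, letter) -> state (assumed total on the letters of z).
    Returns None when no state repeats, i.e. when z is shorter than the number of states.
    """
    seen = {q0: 0}              # state -> number of letters read when first visited
    state = q0
    for pos, letter in enumerate(z, start=1):
        state = delta[(state, letter)]
        if state in seen:       # the leftmost pair of repeating states is found
            j = seen[state]
            return z[:j], z[j:pos], z[pos:]
        seen[state] = pos
    return None


def accepts(delta, q0, finals, word):
    """Run the automaton and report whether it ends in an accepting state."""
    state = q0
    for letter in word:
        state = delta[(state, letter)]
    return state in finals


# Hypothetical example: two states, accepting words with an even number of a's.
delta = {("even", "a"): "odd", ("even", "b"): "even",
         ("odd", "a"): "even", ("odd", "b"): "odd"}
u, v, w = pump_decomposition(delta, "even", "abab")
pumped = [u + v * i + w for i in range(4)]   # the words z_0, z_1, z_2, z_3
```

For the word abab the run repeats the state reached after the first letter, so the repeated part is the single letter b, and every pumped word keeps the number of letters a unchanged.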


As a consequence of the pumping lemma, it may be concluded that computations
of a finite automaton are determined by a finite set of words, which are not longer
than the constant $n_L$. A computation for a word which is longer than $n_L$ can be
shortened by removing its inner part(s), as in the pumping lemma. This implies that the
set of accepting states visited by a deterministic finite automaton can be effectively
calculated by investigating the finite set of words which are not longer than the constant $n_L$.




Computations for longer words cannot bring a new accepting state. This conclusion 
can be formally expressed as follows. 


Remark 7.2. For any word $z \in L$, $|z| \geq n_L$, there exists a word $w \in L$, $|w| < n_L$, such
that the computation for the word $z$ is $\alpha_z = \alpha_1 \alpha_2 \alpha_3$ and the computation for the
word $w$ is $\alpha_w = \alpha_1 \alpha_3$. Note that $\alpha_2$ may include many different repeating parts.


7.1.4 The Myhill-Nerode Theorem 


In this section the Myhill-Nerode theorem is formulated and proved. The Myhill- 
Nerode lemma, which was used in Chapter 1, is a direct consequence of the Myhill- 
Nerode theorem. 


Theorem 7.5 (the Myhill-Nerode Theorem).
The following conditions are equivalent:


1. a language $L$ is accepted by a deterministic finite automaton $M = (Q, \Sigma, \delta, q_0, F)$,

2. a language $L$ is a union of some classes of a right invariant equivalence relation
with finite index,

3. the relation $R_L$ induced by the language $L$ has finite index.


Proof. The following implications between the above conditions will be shown:
$1 \Rightarrow 2 \Rightarrow 3 \Rightarrow 1$.


$1 \Rightarrow 2$


Assume that a deterministic finite automaton $M = (Q, \Sigma, \delta, q_0, F)$ is given. Let
us define a relation $\rho_M \subseteq \Sigma^* \times \Sigma^*$ such that for any $x, y \in \Sigma^*$,
$x \rho_M y \Leftrightarrow \delta(q_0, x) = \delta(q_0, y)$. The relation is:


• a right invariant relation because:
$(\forall x, y, z \in \Sigma^*)\; \delta(q_0, x) = \delta(q_0, y) \Rightarrow \delta(q_0, xz) = \delta(q_0, yz)$,

• an equivalence relation since the equality relation is an equivalence relation. It is
obvious that $\rho_M$ is


- reflexive, i.e. $(\forall x \in \Sigma^*)\; \delta(q_0, x) = \delta(q_0, x)$,
- symmetric, i.e. $(\forall x, y \in \Sigma^*)\; \delta(q_0, x) = \delta(q_0, y) \Rightarrow \delta(q_0, y) = \delta(q_0, x)$,
- transitive, i.e.
$(\forall x, y, z \in \Sigma^*)\; \delta(q_0, x) = \delta(q_0, y) \wedge \delta(q_0, y) = \delta(q_0, z) \Rightarrow \delta(q_0, x) = \delta(q_0, z)$.


It is evident that all words for which the computation ends in the same state create
an equivalence class. We conclude that the number of equivalence classes is equal to
the number of states reachable from $q_0$, so it is not greater than the number of states
$|Q|$ of the automaton $M$. It is also evident that the language $L$ is a union of those
equivalence classes which correspond to accepting states.


$2 \Rightarrow 3$


For any $x, y \in \Sigma^*$, if $x \rho_M y$, then either $x, y \in L$ or $x, y \notin L$ (because $L$ is a union of
some equivalence classes of $\rho_M$). Moreover, for any $z \in \Sigma^*$, if $x \rho_M y$, then $xz \, \rho_M \, yz$,
i.e. either $xz, yz \in L$ or $xz, yz \notin L$ (because $\rho_M$ is a right invariant relation). For that
reason $(\forall x, y \in \Sigma^*)\; x \rho_M y \Rightarrow x R_L y$. As a conclusion, every equivalence
class of the relation $\rho_M$ is included in some equivalence class of the relation $R_L$
induced by the language $L$. Then $R_L$ has no more equivalence classes than
$\rho_M$ has, i.e. the number of equivalence classes of $R_L$ is finite.


$3 \Rightarrow 1$


Assume that the relation $R_L$ induced by the language $L$ has a finite number of
equivalence classes. The following deterministic finite automaton accepts the language
$L$:

$$M = (Q, \Sigma, \delta, q_0, F)$$


where:


• $Q = \{q_{[w]} : w \in \Sigma^*\}$ - states correspond to equivalence classes of $R_L$,

• $\Sigma$ - the alphabet of the language $L$,

• $q_0 = q_{[\varepsilon]}$ - the state corresponding to the equivalence class $[\varepsilon]$, which includes the
empty word $\varepsilon$, is the initial state,

• $F = \{q_{[w]} : w \in L\}$ - accepting states correspond to equivalence classes which
are included in the language $L$,

• $\delta$ - the transition function is defined by the formula $\delta(q_{[w]}, a) = q_{[wa]}$ for any $q_{[w]} \in Q$
and any $a \in \Sigma$, where $[w]$ is the equivalence class of the relation $R_L$ represented
by a word $w \in \Sigma^*$. The definition does not depend on the choice of the representative
$w$ because $R_L$ is right invariant.


The automaton $M$ constructed above accepts the language $L$ because:


• for any $w \in L$, $\delta(q_0, w) = \delta(q_{[\varepsilon]}, w) = q_{[w]}$ (a simple inductive proof justifies this
equality), i.e. $\delta(q_0, w) \in F$,
• likewise, for any $w \notin L$, $\delta(q_0, w) \notin F$.


The proof is completed.


7.1.5 Minimization of deterministic finite automata 


The Myhill-Nerode theorem permits minimization of a deterministic finite automaton.
First of all, note that the relation $R_L$ induced by a regular language $L$ is the most
general right invariant equivalence relation defining the language $L$. Namely, an
equivalence relation defines a language if and only if the language is a union of some
equivalence classes of this relation. Recall that an equivalence relation $E_1$ is more
general than an equivalence relation $E_2$ if and only if equivalence classes of $E_2$ are
included in equivalence classes of $E_1$.

Secondly, the deterministic automaton constructed in the proof of the third
implication is a minimal one with regard to the number of states. If this were not true,
then we would be able to build a deterministic automaton $M'$ accepting $L$ which has
fewer states than the automaton $M$ constructed in the proof. Then the relation $\rho_{M'}$
would have fewer equivalence classes than the relation $R_L$. But this is not possible due
to the second implication considered in the proof of the Myhill-Nerode theorem.

Consequently, minimizing a deterministic finite automaton amounts to building the
automaton constructed in the proof of the Myhill-Nerode theorem.
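
In practice the classes of $R_L$ are usually not computed from the language itself but from a given automaton, by merging states that cannot be distinguished by any word. The following Python sketch uses the standard partition refinement technique (often attributed to Moore); it is a swapped-in illustration rather than the construction of these notes, and the function name, the dictionary representation of $\delta$ and the example automaton are assumptions.

```python
def minimize(alphabet, delta, q0, finals):
    """Return (number of states of the minimal automaton, block index of each state).

    alphabet: list of letters; delta: dict (state, letter) -> state, assumed total.
    Only states reachable from q0 are considered.
    """
    # Step 1: collect the states reachable from the initial state.
    reachable, stack = {q0}, [q0]
    while stack:
        q = stack.pop()
        for a in alphabet:
            r = delta[(q, a)]
            if r not in reachable:
                reachable.add(r)
                stack.append(r)
    # Step 2: start from {accepting, non-accepting} and refine until stable.
    block = {q: int(q in finals) for q in reachable}
    while True:
        # Two states stay together iff they are in the same block and every
        # letter leads them to the same block.
        signature = {q: (block[q],) + tuple(block[delta[(q, a)]] for a in alphabet)
                     for q in reachable}
        ids = {sig: i for i, sig in enumerate(sorted(set(signature.values())))}
        new_block = {q: ids[signature[q]] for q in reachable}
        if len(set(new_block.values())) == len(set(block.values())):
            return len(ids), new_block
        block = new_block


# Hypothetical example: a 4-state cycle accepting words of even length over {a},
# plus an unreachable state 4.
delta = {(0, "a"): 1, (1, "a"): 2, (2, "a"): 3, (3, "a"): 0, (4, "a"): 0}
count, blocks = minimize(["a"], delta, 0, {0, 2})
```

Each resulting block corresponds to one equivalence class of $R_L$, so the number of blocks is the number of states of the minimal automaton.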


7.2 More grammars and automata 


7.2.1 Context-free grammars versus pushdown automata 


In this section relations between context-free grammars and pushdown automata 
are discussed. It is shown that pushdown automata are equivalent to context-free 
grammars. Thus, a class of languages accepted by pushdown automata is the class 
of context-free languages. 


Theorem 7.6. Languages generated by context-free grammars are accepted by 
push-down automata. 


Proof. Let a context-free grammar be given and assume that the empty word is not
generated by the grammar. We construct a pushdown automaton accepting the language
generated by this grammar. The automaton accepts with the empty stack.

We assume that the given context-free grammar

$$G = (V, T, P, S)$$

is in Greibach normal form. Let us recall that productions of a grammar in Greibach
normal form are $A \to a\alpha$, where $A \in V$, $a \in T$ and $\alpha \in V^*$. For a given word
$w \in L(G)$ a leftmost derivation in $G$ is considered. A pushdown automaton, when
it computes a given word $w$, follows this leftmost derivation in $G$. A pushdown
automaton equivalent to the grammar $G$ is:

$$M = (\{q_0, q\}, T, V \cup \{\triangleright\}, \delta, q_0, \triangleright, \emptyset)$$


where the transition function is constructed as follows:


• begin with a given word $w \in T^*$ at the input of $M$ and with the initial symbol $S$
of the grammar $G$ on the stack,

• accept if the end-of-input symbol $\triangleleft$ is at the input and the initial stack symbol $\triangleright$
is on the stack,




• if $a \in T$ is an input symbol, $A \in V$ is the top symbol of the stack and there is a
production $A \to a\alpha$ in the grammar $G$, $\alpha \in V^*$, then read the input symbol and
replace the top symbol of the stack with $\alpha$ (remove $A$ from the stack, push the
symbols of $\alpha$ onto the stack in reverse order),

• reject in all other cases.


These rules could be rewritten as follows:

$$\delta(q_0, \varepsilon, \triangleright) = \{(q, S\triangleright)\},$$
$$\delta(q, a, A) = \{(q, \alpha) : A \to a\alpha \in P\},$$
$$\delta(q, \triangleleft, \triangleright) = \{(q, \varepsilon)\}.$$

Modification of the construction for the case when $\varepsilon$ is included in the language
is fairly easy. The automaton should be able to pop the top symbol of the stack in
its first transition, i.e. the first of the rules presented above should be replaced with:


• $\delta(q_0, \varepsilon, \triangleright) = \{(q, S\triangleright), (q, \varepsilon)\}$.


A formal proof is based on mathematical induction with regard to the length of a
derivation.


Notice that the automaton constructed in Theorem 7.6 is, in general, a nondeterministic
one. Nondeterminism arises whenever the grammar includes two different productions
$A \to a\alpha$ and $A \to a\beta$ with the same nonterminal and the same leading terminal. If a
grammar in Greibach normal form is simple, i.e. it satisfies the Greibach uniqueness
condition, then the automaton is a deterministic one.
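
The behaviour of the automaton from Theorem 7.6 can be imitated by tracking all stacks reachable after each input letter. The Python sketch below is an illustration under stated assumptions: nonterminals are single characters, the productions are given as a dictionary keyed by (nonterminal, terminal), and the markers $\triangleright$ and $\triangleleft$ are left implicit (an empty string stands for a stack holding only the bottom marker). The function name and the example grammar are hypothetical.

```python
def gnf_pda_accepts(productions, start, word):
    """Simulate the pushdown automaton built from a grammar in Greibach normal form.

    productions: dict mapping (A, a) -> list of strings alpha, one entry for each
    production A -> a alpha. A stack is a string whose first character is the top.
    """
    stacks = {start}                      # all stacks reachable so far
    for letter in word:
        next_stacks = set()
        for stack in stacks:
            if not stack:
                continue                  # no grammar symbol to pop: branch rejects
            top, rest = stack[0], stack[1:]
            for alpha in productions.get((top, letter), []):
                next_stacks.add(alpha + rest)   # replace the top symbol by alpha
        stacks = next_stacks
    # Accept iff some branch consumed the input and popped every grammar symbol.
    return "" in stacks


# Hypothetical grammar for { a^n b^n : n >= 1 }:  S -> aSB | aB,  B -> b
productions = {("S", "a"): ["SB", "B"], ("B", "b"): [""]}
```

The two productions for the pair (S, a) make the automaton nondeterministic; the set of stacks kept by the simulation covers both choices at once.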


Theorem 7.7. Languages accepted by push-down automata are generated by context- 
free grammars. 


Proof. Assume that an automaton $M = (Q, \Sigma, \Gamma, \delta, q_0, \triangleright, F)$ is given. A grammar
that is equivalent to $M$ should keep track of transitions made by $M$. Intuitively, a
computation should be followed by a leftmost derivation in a grammar in Greibach
normal form. Every transition of the automaton should be accompanied by a set
of productions. Productions are containers to store states and symbols employed
in transitions. The left-hand side of such a production is the top symbol of the stack
accompanied by the states before the transition is done and after it is done. The
right-hand side is a sequence of symbols which are put on the stack. These symbols
are accompanied by the states that are employed in transitions with this stack symbol.
This sequence is preceded by a terminal symbol, which is the input
symbol consumed in this transition.
Now, let us give a detailed description of the grammar $G = (V, T, P, S)$:


• $V = Q \times \Gamma \times Q \cup \{S\}$,
• $T = \Sigma$,
• productions are:


◦ $S \to (q_0, \triangleright, q)$ for all $q \in Q$, these productions initiate simulation of a
computation,




◦ if $\delta(q, a, X)$ includes $(r, A_1 A_2 \ldots A_m)$ for $a \in \Sigma \cup \{\varepsilon\}$, $q, r \in Q$ and
$X, A_1, A_2, \ldots, A_m \in \Gamma$, then the following productions should be added for
all $r_1, r_2, \ldots, r_m \in Q$: $(q, X, r_m) \to a \, (r, A_1, r_1)(r_1, A_2, r_2) \ldots (r_{m-1}, A_m, r_m)$, i.e.
a transition $(r, A_1 A_2 \ldots A_m) \in \delta(q, a, X)$ is developed in such a way that
(1) the state $r$ after the first switch from the state $q$ is preserved in all productions,
(2) the final state $r_m$ after employing the series of transitions for the stack symbols
$A_1, A_2, \ldots, A_m \in \Gamma$ is preserved as well and (3) a track of states along the series
of transitions is kept,

◦ if $\delta(q, a, X)$ includes $(r, \varepsilon)$ for $q, r \in Q$ and $a \in \Sigma \cup \{\varepsilon\}$, i.e. the top symbol
of the stack is popped, then the corresponding nonterminal symbol is removed from an
intermediate word of a derivation: $(q, X, r) \to a$.


Chapter 8 
The hierarchy 


8.1 More operations on languages 


8.1.1 Substitutions, homomorphisms 
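Definition 8.1. Let $\Sigma$ and $\Delta$ be alphabets. A mapping $f$ of letters of the alphabet $\Sigma$
into languages over the alphabet $\Delta$:
$$f : \Sigma \to 2^{\Delta^*}$$
is called a substitution. A substitution on an alphabet $\Sigma$ can be extended


• to words over the alphabet $\Sigma$:

$$f : \Sigma^* \to 2^{\Delta^*}$$
$$f(\varepsilon) = \{\varepsilon\}$$
$$f(wa) = f(w) \circ f(a) \quad (\forall w \in \Sigma^*)\,(\forall a \in \Sigma)$$


• and to languages over the alphabet $\Sigma$:

$$f : 2^{\Sigma^*} \to 2^{\Delta^*}$$
$$f(L) = \bigcup_{w \in L} f(w) \quad (\forall L \subseteq \Sigma^*)$$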


This general definition of a substitution may be restricted to a class of languages
by assuming that both the values and the arguments of a substitution are languages of
this class. In this section discussion on substitutions is restricted to regular languages.
Explicitly, we assume that substitutions are mappings $f : \Sigma \to RgL(\Delta)$,
$f : \Sigma^* \to RgL(\Delta)$ and $f : RgL(\Sigma) \to RgL(\Delta)$, where $RgL(\Sigma)$ and $RgL(\Delta)$ are the
classes of regular languages over the alphabets $\Sigma$ and $\Delta$, respectively.

The definition of a substitution could be reformulated for regular expressions.




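Definition 8.2. Let $\Sigma$ and $\Delta$ be alphabets. A mapping $f$ of letters of the alphabet $\Sigma$
into regular expressions $REx(\Delta)$ over the alphabet $\Delta$:

$$f : \{a : a \in \Sigma\} \to REx(\Delta)$$

is called a substitution. A substitution on an alphabet $\Sigma$ can be extended to regular
expressions $REx(\Sigma)$ over the alphabet $\Sigma$:

$$f : REx(\Sigma) \to REx(\Delta)$$

such that
$$f(\emptyset) = \emptyset$$
$$f(\varepsilon) = \varepsilon$$
$$f(a) = \text{according to the definition of } f \quad (\forall a \in \Sigma)$$
and
$$f(s + t) = (f(s) + f(t))$$
$$f(s \circ t) = (f(s) \circ f(t))$$
$$f(s^*) = (f(s))^*$$

where $\emptyset$, $\varepsilon$ and $a$ for $a \in \Sigma$ are basic regular expressions, $s$ and $t$ are regular
expressions (used in the inductive step of the definition).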


Remark 8.1. In the class of regular languages, substitutions can be considered alter- 
natively with regard to languages or with regard to regular expressions generating 
these languages. 


Definition 8.3. Let $\Sigma$ and $\Delta$ be alphabets. A substitution $h : \Sigma \to 2^{\Delta^*}$ such that
$|h(a)| = 1$ for all $a \in \Sigma$ is called a homomorphism. In other words, a homomorphism
is a substitution which yields one-word languages. A homomorphism is extended to
words and languages in the same way as a substitution.


Remark 8.2. A homomorphism $h$ is identified with a mapping $h : \Sigma \to \Delta^*$, which
yields a word over the alphabet $\Delta$ for a letter of the alphabet $\Sigma$ rather than a language
including only this word.


Definition 8.4. Let $h : \Sigma \to \Delta^*$ be a homomorphism. An inverse homomorphic
image of a word $w \in \Delta^*$ is the set of words (language):

$$h^{-1}(w) = \{x \in \Sigma^* : h(x) = w\}$$

An inverse homomorphic image of a language $L \subseteq \Delta^*$ is the set of words:

$$h^{-1}(L) = \bigcup_{w \in L} h^{-1}(w) = \{x \in \Sigma^* : h(x) \in L\}$$
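
The definitions above can be tried out on finite languages. The following Python sketch is only an illustration: a homomorphism is assumed to be given as a dictionary from letters to words, and, because an inverse image may be infinite, the inverse image is listed only up to an assumed length bound `max_len`. All names and the example mapping are hypothetical.

```python
from itertools import product


def apply_hom(h, word):
    """Extend a letter-to-word mapping h to words: h(wa) = h(w) h(a)."""
    return "".join(h[letter] for letter in word)


def image(h, language):
    """Image of a finite language: the union of the images of its words."""
    return {apply_hom(h, w) for w in language}


def inverse_image(h, language, max_len):
    """Words x over the domain alphabet with |x| <= max_len and h(x) in language."""
    sigma = sorted(h)
    return {"".join(x)
            for n in range(max_len + 1)
            for x in product(sigma, repeat=n)
            if apply_hom(h, "".join(x)) in language}


# Hypothetical homomorphism from {a, b, c} into words over {0, 1}.
h = {"a": "01", "b": "0", "c": "1"}
```

For this mapping the word 01 has two preimages of length at most two, namely a and bc, which shows that an inverse homomorphic image of a single word may contain several words.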




8.1.2 Quotients 


Quotients of languages are actually applied to words. A quotient of words is
essentially the opposite of concatenation. A quotient of words is a kind of reduction of
one word, a dividend, by another one, a divisor. Two types of quotients are defined:
the right quotient and the left quotient.


Definition 8.5. Let $L_1$ and $L_2$ be languages over an alphabet $\Sigma$.

The right quotient of a language $L_1$ with a language $L_2$ is the language $L_1 / L_2$
consisting of such words over the alphabet $\Sigma$ which, concatenated with words
of the divisor, give words of the dividend:

$$L_1 / L_2 = \{x \in \Sigma^* : (\exists y \in L_2)\; xy \in L_1\}$$

The left quotient of a language $L_1$ with a language $L_2$ is the language $L_1 \backslash L_2$
consisting of such words over the alphabet $\Sigma$ which, concatenated to words
of the divisor, give words of the dividend:

$$L_1 \backslash L_2 = \{y \in \Sigma^* : (\exists x \in L_2)\; xy \in L_1\}$$


Remark 8.3. The definition of quotients of languages could be reformulated for
regular expressions instead of languages. Such a formulation corresponds, of course, to
quotients of languages generated by regular expressions, i.e. to quotients of regular
languages.
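
For finite languages both quotients can be computed directly from Definition 8.5. The short Python sketch below is an illustration only; the function names and the sample languages are assumptions, and the first argument plays the role of the dividend, as in the notation of the definition.

```python
def right_quotient(l1, l2):
    """L1 / L2 = { x : there is y in L2 with xy in L1 }."""
    return {z[:len(z) - len(y)] for z in l1 for y in l2 if z.endswith(y)}


def left_quotient(l1, l2):
    """Left quotient of L1 with L2 = { y : there is x in L2 with xy in L1 }."""
    return {z[len(x):] for z in l1 for x in l2 if z.startswith(x)}


# Hypothetical sample languages.
dividend = {"abc", "bc", "ac"}
```

Dividing by a one-letter language such as {a} simply removes the leading (or trailing) letter a from those words of the dividend that have it.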


8.1.3 Automata building with quotients 


A language $L = L(M)$ accepted by a deterministic finite automaton $M = (Q, \Sigma, \delta, q_0, F)$
is the set of words $L = \{w \in \Sigma^* : \delta(q_0, w) \in F\}$. Let us consider the languages:


• $L_a = \{w \in \Sigma^* : \delta(q_a, w) \in F\}$, where $q_a = \delta(q_0, a)$ for $a \in \Sigma$. These languages
are accepted by the deterministic automata $M_a = (Q, \Sigma, \delta, q_a, F)$. Note that $L_a$
is derived from $L$ by removing the first letter $a$ from those words of $L$ which begin
with $a$, i.e. $L_a = \{u \in \Sigma^* : (\exists w \in L)\; au = w\}$. The last formula defines the left
quotient of the language $L$ with the language $\{a\}$ (this language includes one word of
unit length), i.e. $L_a = L \backslash \{a\}$,

• $L_{ab} = \{w \in \Sigma^* : \delta(q_{ab}, w) \in F\}$, where $q_{ab} = \delta(q_0, ab)$ for $a, b \in \Sigma$. $L_{ab}$
is derived from $L$ by removing the two leading letters from words of $L$ or, in
other words, $L_{ab}$ is derived from $L_a$ by removing the first letter from its words:
$L_{ab} = L \backslash \{ab\} = L_a \backslash \{b\}$,

• $L_{abc} = \{w \in \Sigma^* : \delta(q_{abc}, w) \in F\}$, where $q_{abc} = \delta(q_0, abc)$ for $a, b, c \in \Sigma$,

• $\ldots$


How many languages do we get in the above process of quotients? At a glance, as
many as there are words over the alphabet $\Sigma$. But, since the languages are tied to states
of a finite automaton, we get no more languages than the number of states. On the other
hand, the languages are computed as quotients by words over the alphabet of this language
and may be tied to any deterministic finite automaton, including an automaton with
the minimal number of states. Hence, the number of languages does not depend on
a particular automaton. From the Myhill-Nerode theorem we conclude that these
languages are tied to equivalence classes of the relation $R_L$ induced by the regular
language $L$ rather than to a particular deterministic finite automaton accepting $L$.
However, these languages are not equivalence classes of $R_L$. Moreover, in general they
are not equivalence classes of any equivalence relation, because different quotients may
share words.

Now, consider which of the above languages correspond to equivalence classes
of $R_L$ included in the language $L$. Take a particular language $L_w$. Note that there
are many $u \in \Sigma^*$ defining the same language. In fact, the set of words defining a
particular language can be written as $E_w = \{u \in \Sigma^* : L_u = L_w\}$. If an automaton
$M$ is a minimal one, then $E_w = \{u \in \Sigma^* : \delta(q_0, u) = \delta(q_0, w)\}$ is an equivalence
class of $R_L$, cf. the Myhill-Nerode theorem. If $v$ is a shortest word in $E_w$, then the
state $\delta(q_0, v) = \delta(q_0, w)$ is an accepting one if and only if $v \in L$, which is equivalent to
$\varepsilon \in L_w$.

As a conclusion of this discussion, we get the following proposition:


Proposition 8.1. A regular language $L$ over an alphabet $\Sigma$ is accepted by a
deterministic finite automaton
$$M = (Q, \Sigma, \delta, q_0, F)$$

where:


• $Q = \{q_{L_w} : L_w = L \backslash \{w\} \wedge w \in \Sigma^*\}$, i.e. states are labelled by quotients of the
language,

• $q_0 = q_{L_\varepsilon}$, where $L_\varepsilon = L$,

• $F = \{q_{L_w} : \varepsilon \in L_w\}$,

• $\delta(q_{L_w}, a) = q_{L_{wa}}$ for $a \in \Sigma$, $w \in \Sigma^*$.


Moreover, this automaton is a minimal one (with regard to the number of states)
accepting $L$, assuming that all equal languages are identified.
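
For a finite language the automaton of Proposition 8.1 can be built mechanically, because every quotient is again a finite set of words. The Python sketch below is a minimal illustration under that assumption (an infinite regular language would need a symbolic representation of quotients instead); the function names and the sample language are hypothetical.

```python
def quotient_automaton(language, alphabet):
    """Build the automaton of Proposition 8.1 for a finite language.

    States are the distinct left quotients of the language by words,
    stored as frozensets, so equal quotients are identified automatically.
    """
    start = frozenset(language)                 # the quotient by the empty word
    states, delta, todo = {start}, {}, [start]
    while todo:
        lang = todo.pop()
        for a in alphabet:
            # Left quotient of lang with {a}: drop a leading letter a.
            nxt = frozenset(w[1:] for w in lang if w.startswith(a))
            delta[(lang, a)] = nxt
            if nxt not in states:
                states.add(nxt)
                todo.append(nxt)
    finals = {lang for lang in states if "" in lang}   # the empty word is in L_w
    return states, delta, start, finals


def run(delta, start, finals, word):
    """Run the automaton on a word and report acceptance."""
    state = start
    for a in word:
        state = delta[(state, a)]
    return state in finals


# Hypothetical sample language over {a, b}.
states, delta, start, finals = quotient_automaton({"ab", "bb"}, "ab")
```

Here the quotients by a and by b coincide, so both letters lead from the initial state to the same state; the empty quotient plays the role of the rejecting sink.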


8.2 The hierarchy of languages 


Proposition 8.2. The class RgL of regular languages is included in, but not equal to,
the class CFL of context-free languages.


Proof. According to Theorem 7.1 and Theorem 7.3 regular languages are generated
by right-linear grammars. On the other hand, right-linear grammars are context-free
ones. Thus, we have inclusion. Moreover, the language
$L = \{w \in \{a, b\}^* : \#_a w = \#_b w > 0\}$ is not a regular one, but it is a context-free one.


Proposition 8.3. The class CFL of context-free languages is included in, but not equal
to, the class CSL of context-sensitive languages.




Proof. Again, context-free languages are generated by context-free grammars. On
the other hand, context-free grammars are also context-sensitive ones, which generate
context-sensitive languages. Moreover, the language
$L = \{w \in \{a, b, c\}^* : \#_a w = \#_b w = \#_c w > 0\}$ is not a context-free one, but it is a
context-sensitive one. Thus, we have inclusion, but not equality.


Lemma 8.1. There is a Turing machine with the stop property (an algorithm) to
check if a given word $z = a_1 a_2 \ldots a_n$ is generated by a given context-sensitive
grammar $G = (V, T, P, S)$.


Proof. The Turing machine realizes a shortest paths algorithm. Let us build a graph
with nodes labelled by all words $w \in (V \cup T)^*$ which are not longer than $n$. Note
that the initial symbol $S$ of the grammar and the word $z$ are among the labels of nodes.
Two nodes labelled with words $u$ and $v$ are connected with a directed edge if and
only if there is a direct derivation of $v$ from $u$ in $G$. Productions of a context-sensitive
grammar do not shorten words, so no word longer than $n$ can appear in a derivation
of $z$. In this way, $z \in L(G)$ if and only if there is a path in this graph from the node
labelled $S$ to the node labelled $z$. The graph is finite, so a Turing machine with the stop
property can be built which checks existence of such a path and finds the path. Such
a machine implements an algorithm for path searching in a graph.
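
The path search from the proof can be sketched as a breadth-first search that generates the graph on the fly instead of building all nodes in advance. The following Python code is an illustration under stated assumptions: symbols are single characters, productions are pairs of strings that never shorten a word, and the example grammar for $\{a^n b^n c^n : n \geq 1\}$ is a commonly used noncontracting grammar that is not taken from these notes.

```python
from collections import deque


def generates(productions, start, word):
    """Decide whether a noncontracting grammar derives `word`.

    productions: list of (lhs, rhs) string pairs with len(lhs) <= len(rhs).
    Only intermediate words not longer than `word` are explored.
    """
    limit = len(word)
    seen, queue = {start}, deque([start])
    while queue:
        form = queue.popleft()
        if form == word:
            return True
        for lhs, rhs in productions:
            i = form.find(lhs)
            while i != -1:                      # every occurrence of lhs in form
                new = form[:i] + rhs + form[i + len(lhs):]
                # Longer words can never shrink back to the target word.
                if len(new) <= limit and new not in seen:
                    seen.add(new)
                    queue.append(new)
                i = form.find(lhs, i + 1)
    return False


# Hypothetical example grammar for { a^n b^n c^n : n >= 1 }.
productions = [("S", "aSBC"), ("S", "aBC"), ("CB", "BC"),
               ("aB", "ab"), ("bB", "bb"), ("bC", "bc"), ("cC", "cc")]
```

The search always terminates because there are finitely many words of length at most $n$ over a finite set of symbols, which is exactly the stop property required in the lemma.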


Lemma 8.2. The set of context-sensitive grammars is countable, i.e. context-sensitive 
grammars can be enumerated with natural numbers. 


Proof. In fact, we can encode context-sensitive grammars as natural numbers.
Assume that $G = (V, T, P, S)$ is a context-sensitive grammar. Then the following is done:


• terminal and nonterminal symbols are replaced with binary numbers represented
by blocks of digits of fixed length. The number of binary digits necessary for
encoding terminal and nonterminal symbols and an additional symbol is equal
to $p = \lfloor \log_2(|V| + |T| + 1) \rfloor + 1$. Assume that the number $2^p - 1$ enumerates a
special symbol, a separator. It is represented as the block $111\ldots11$ of $p$ binary
digits 1,

• nonterminal symbols are enumerated by successive natural numbers starting with
0 (represented as the block $000\ldots00$ of $p$ binary digits 0). Assume that the
beginning symbol $S$ of a grammar is enumerated with 0. Of course, every number is
represented by a string of $p$ binary digits, some or all of them being nonsignificant
zeros,

• enumeration of terminal symbols is continued with successive natural numbers
following enumeration of nonterminal symbols,

• productions are represented as sequences of $p$-digit blocks enumerating symbols.
The special symbol (i.e. the block of $p$ ones) separates both sides of
productions,

• a grammar is encoded as the following sequence of blocks of $p$ binary digits:


◦ encoding begins with two separators (two blocks of 1s),
◦ nonterminal symbols ($|V|$ blocks of binary digits, the first one is the block of
0s),




◦ the separator,
◦ terminal symbols ($|T|$ blocks of binary digits),
◦ the separator and a production, these two elements are repeated for every
production,
◦ encoding ends with two separators.


As a result, we get a binary number that encodes the given grammar $G$.

Note that at most one context-sensitive grammar is encoded as a given natural
number and not every number encodes a grammar, i.e. there are numbers whose
binary representation is not a valid code of any grammar. However, we can assume
that numbers which are not valid codes of context-sensitive grammars encode a
grammar generating the empty language. Likewise, not every natural number
represents a word over the set of terminal symbols $T$, for instance, binary words shorter
than $p$. But we can treat such natural numbers as not generated by the grammar.

This encoding is ambiguous, i.e. a given grammar can have many codes. For
instance, the order of symbols or productions affects the result of encoding. Anyway,
all context-sensitive grammars are encoded as natural numbers and no number
is a code of two grammars. Therefore, grammars can be ordered according to their
smallest codes, which gives a method of enumeration of grammars and of accessing
the grammar encoded as a given number. Simply take binary representations of
successive natural numbers and check each of them for being a correctly encoded
grammar. If a grammar of a given code is searched for, continue this process until this
code is found. Of course, any grammar can be identified in this way.
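
The encoding described in the proof can be sketched in a few lines. The Python code below follows the listed steps under stated assumptions: symbols are single characters, the first nonterminal in the list is the initial symbol, and the function name and the sample grammar are hypothetical. It returns the block length $p$ together with the binary string.

```python
def encode_grammar(nonterminals, terminals, productions):
    """Encode a grammar as a binary string built from blocks of p digits.

    nonterminals[0] is the initial symbol; productions are (lhs, rhs) pairs
    of symbol sequences (here: strings of single-character symbols).
    """
    symbols = list(nonterminals) + list(terminals)
    # p = floor(log2(|V| + |T| + 1)) + 1, computed with integer arithmetic.
    p = (len(symbols) + 1).bit_length()
    code = {s: format(i, f"0{p}b") for i, s in enumerate(symbols)}
    sep = "1" * p                          # the separator: the number 2^p - 1
    blocks = [sep, sep]                    # encoding begins with two separators
    blocks += [code[s] for s in nonterminals]
    blocks.append(sep)
    blocks += [code[s] for s in terminals]
    for lhs, rhs in productions:
        blocks.append(sep)                 # the separator before every production
        blocks += [code[s] for s in lhs] + [sep] + [code[s] for s in rhs]
    blocks += [sep, sep]                   # encoding ends with two separators
    return p, "".join(blocks)


# Hypothetical sample grammar: S -> aSb | ab.
p, bits = encode_grammar(["S"], ["a", "b"], [("S", "aSb"), ("S", "ab")])
number = int(bits, 2)                      # the natural number encoding the grammar
```

Changing the order of the two productions gives a different number for the same grammar, which illustrates the ambiguity of the encoding mentioned above.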


Proposition 8.4. The class CSL of context-sensitive languages is included in, but not
equal to, the class RkL of recursive languages.


Proof. Context-sensitive languages are accepted by linear bounded automata. Linear
bounded automata are restricted Turing machines with the stop property. Recursive
languages are accepted by Turing machines with the stop property. Thus, we
have inclusion of the class CSL in the class RkL.

Now we construct a language that is in the RkL class, but not in the CSL class. Let us
build a relation $r \subseteq N \times N$. A pair $(k, l)$ of natural numbers belongs to this relation
if and only if the context-sensitive grammar encoded as $l$ generates the binary word
at the $k$th place in the canonical order, i.e. $r_{k,l} = 1$ if the $l$th grammar generates the $k$th
word, $r_{k,l} = 0$ otherwise. Consider the language of words which are not generated
by the corresponding grammar, i.e. with 0 at the main diagonal in Table 8.1. This is
the so-called diagonal language $L_d = \{w \in \{0, 1\}^* : w = w_i \wedge r_{i,i} = 0\}$ for the class
CSL. We arrive at a contradiction if we assume that $L_d$ is context-sensitive. If it is
context-sensitive, then, due to Lemma 8.2, it is generated by a context-sensitive
grammar encoded as some natural number, say $k$; denote this grammar by $G_k$. Consider
the word $w_k$ in the canonical order. If it is in $L_d$, then $r_{k,k} = 0$ by definition of $L_d$.
However, $r_{k,k} = 0$ means that $G_k$ does not generate $w_k$, though it should. On the other
hand, if $w_k$ does not belong to $L_d$, then $r_{k,k} = 1$ by definition of $L_d$. However, $r_{k,k} = 1$
means that $G_k$ generates $w_k$, though it should not. Thus, the diagonal language $L_d$ for
the class CSL cannot be a context-sensitive one.




The language $L_d$ is accepted by a Turing machine with the stop property. Such a
machine realizes the following algorithm:


• finds the number $k$ of a given binary word $w$ in the canonical order, i.e. $w = w_k$,

• finds the context-sensitive grammar $G_k$ encoded as the number $k$,

• checks if the grammar $G_k$ generates the word $w = w_k$ or not, and accepts $w$ if and
only if it does not. The method shown in Lemma 8.1 can be employed for checking.


This method allows answering the question whether any word is generated by a given
grammar or not. This shows that the diagonal language $L_d$ for the class CSL belongs
to the RkL class. In this way we have proved that the CSL class is included in, but not
equal to, the RkL class.


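word \ grammar    0          1          2          3          ...    k
w_0 = ε           r_{0,0}    r_{0,1}    r_{0,2}    r_{0,3}    ...    r_{0,k}
w_1 = 0           r_{1,0}    r_{1,1}    r_{1,2}    r_{1,3}    ...    r_{1,k}
w_2 = 1           r_{2,0}    r_{2,1}    r_{2,2}    r_{2,3}    ...    r_{2,k}
w_3 = 00          r_{3,0}    r_{3,1}    r_{3,2}    r_{3,3}    ...    r_{3,k}
...
w_k               r_{k,0}    r_{k,1}    r_{k,2}    r_{k,3}    ...    ?


Table 8.1 The membership table for context-sensitive grammars.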
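
The canonical order of binary words used in Table 8.1 (shorter words first, words of equal length in lexicographic order) has a simple arithmetic description: the $k$th word is the binary representation of $k + 1$ without its leading digit. The Python sketch below illustrates this and the diagonal test; the function names are assumptions, and `kth_grammar_generates` is a hypothetical stand-in for the decision procedure of Lemma 8.1 applied to the grammar encoded as $k$.

```python
def word_at(k):
    """The k-th binary word in canonical order: epsilon, 0, 1, 00, 01, ..."""
    return format(k + 1, "b")[1:]


def index_of(word):
    """Inverse of word_at: the position of a binary word in canonical order."""
    return int("1" + word, 2) - 1


def in_diagonal(word, kth_grammar_generates):
    """Membership in the diagonal language: w = w_k and r_{k,k} = 0.

    kth_grammar_generates(k, word) is a hypothetical callable that answers
    whether the grammar encoded as k generates the word.
    """
    k = index_of(word)
    return not kth_grammar_generates(k, word)
```

With a toy oracle that answers "yes" exactly for even $k$, the word 0 (index 1) belongs to the diagonal language and the empty word (index 0) does not.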


Lemma 8.3. The set of Turing machines is countable, i.e. Turing machines can be 
enumerated with natural numbers. 


Proof. The proof is similar to the proof of Lemma 8.2. We encode Turing machines
as natural numbers. Assume that $M = (Q, \Sigma, \Gamma, \delta, q_0, B, \{q_A\})$ is a Turing machine
with a halting accepting state. Then we encode the machine $M$ as a binary number
in the following way:


• states, symbols of the tape alphabet $\Gamma$, symbols of the input alphabet $\Sigma$ and
two symbols of directions of the head move are replaced with binary numbers
represented by blocks of digits of fixed length. An additional symbol, a separator,
is included in the set of codes. The number of binary digits necessary for such
an encoding is equal to $p = \lfloor \log_2(|\Sigma| + |\Gamma| + |Q| + 3) \rfloor + 1$. Assume that the
number $2^p - 1$ enumerates a special symbol, a separator. It is represented as the
string $111\ldots11$ of $p$ binary digits 1,




• states are enumerated by successive natural numbers starting with 0. Assume that
the initial state is enumerated with the number 0 and the accepting state with the
number 1 (of course, every number is represented by a block of $p$ binary digits,
some or all of them being nonsignificant zeros),

• enumeration of symbols of $\Gamma$ is continued with successive natural numbers
following enumeration of states,

• then enumeration of symbols of $\Sigma$ is continued with successive natural numbers
following enumeration of symbols of $\Gamma$,

• then enumeration of directions of the head move is done with the next two
successive natural numbers following enumeration of symbols of $\Sigma$,

• every transition $\delta(q, X) = (p, Y, D)$ is represented as five $p$-digit blocks, namely
these blocks represent the arguments of the transition function (a state $q$ and a tape
symbol $X$) and its result (a state $p$, a tape symbol $Y$ and a direction $D$),

• a Turing machine is encoded as the following sequence of blocks of $p$ binary
digits:


◦ the encoding begins with two separators (two blocks of 1s),

◦ states ($|Q|$ blocks of binary digits, the first one is the block of 0s), recall that
the initial state is encoded with the number 0 and the accepting state is encoded
with the number 1,

◦ the separator,

◦ symbols of $\Gamma$ ($|\Gamma|$ blocks of binary digits),

◦ the separator,

◦ symbols of $\Sigma$ ($|\Sigma|$ blocks of binary digits),

◦ the separator,

◦ directions of the head moves (2 blocks of binary digits),

◦ the separator,

◦ the separator and a transition, these two elements are repeated for every
transition (every entry of the transition table),

◦ the encoding ends with two separators.




This encoding is ambiguous, as is the similar encoding in Lemma 8.2. Natural
numbers which are not valid codes of a Turing machine are assumed to be codes of a
machine falling into an infinite computation for every input. Binary words not
representing any word over $\Sigma$ are assumed not to be accepted by a Turing machine.

Finally, we can conclude that all Turing machines can be enumerated with natural
numbers.


Proposition 8.5. There are languages, which are not recursively enumerable. 


Proof. The set of all words $\Sigma^*$ over a given alphabet $\Sigma$ is infinite and countable.
The class of languages $ALL = \{L : L \subseteq \Sigma^*\}$ is the power set of $\Sigma^*$, so it is
uncountable. On the other hand, languages of the REL class are accepted by Turing
machines. The set of Turing machines is countable, cf. Lemma 8.3, so the class
REL of languages is countable. Since the class ALL is an uncountable set, it
cannot be equal to its countable subset. This proves that there are languages which are
not recursively enumerable.




Now, let us construct a language that is not recursively enumerable. Let $r \subseteq N \times N$
be a relation built in a similar way as in Proposition 8.4, i.e. a pair $(k, l)$ of natural
numbers belongs to this relation if and only if the Turing machine encoded as $l$
accepts the binary word at the $k$th place in the canonical order. In other words, $r_{k,l} = 1$
if the $l$th Turing machine accepts the $k$th word, $r_{k,l} = 0$ otherwise. Note that $r_{k,l} = 0$
means that the $l$th Turing machine either terminates its computation and rejects its
input or falls into an infinite computation. The diagonal language
$L_d = \{w \in \{0, 1\}^* : w = w_i \wedge r_{i,i} = 0\}$ for the class REL built on this relation is not
recursively enumerable. Again, as in Proposition 8.4, we arrive at a contradiction if we
assume that $L_d$ is a recursively enumerable one. If $L_d$ is in the REL class, then there
exists a Turing machine which accepts it, say the $k$th machine $M_k$. Consider $r_{k,k}$. If
$r_{k,k} = 1$, then $w_k$ is accepted by $M_k$, so $w_k \in L_d$, but it should not be due to the
definition of $L_d$. If $r_{k,k} = 0$, then $w_k$ is not accepted by $M_k$, so $w_k \notin L_d$, but it should be
due to the definition of $L_d$. Thus, the diagonal language $L_d$ cannot be a recursively
enumerable one.


Lemma 8.4. The universal language Lq is a recursively enumerable one. 


Proof. We construct a Turing machine, which accepts the universal language Ly. 
This is so called the universal Turing machine M,,. The machine has three tapes. It 
realizes the following algorithm: 


1. checks if the input word is of the form (M)w, where (M) is a valid code of a Turing 
machine: 


o looks for the beginning sequence of 1s; if its length is not an even number 2r, rejects the input, 

o stores r 0s on the third tape; the content of the third tape is used as a measure of 
the length of the blocks of binary digits encoding the machine and as the number of 
the current state, 

o looks for the next sequence of 2r 1s, 

o moves the beginning part of the input word, bounded by both blocks of 2r 1s, 
to the second tape, leaves w on the first tape and places the head of the first tape 
on the leftmost symbol of w, 

o checks if the content of the second tape is a valid code (M) of a Turing machine, 


2. repeats the following actions until the number stored on the third tape encodes 
the accepting state: 


o for the state q stored on the third tape and for the symbol X stored as a block 
of r binary digits, with the head of the first tape placed on the leftmost digit 
of this block, retrieves a matching transition δ(q,X) = (p,Y,D). There may be 
more than one matching transition, so this is a nondeterministic choice, 

o replaces the content of the third tape with p, replaces X by Y and moves the head of 
the first tape to the neighboring block in the direction described by D. 
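
The loop in step 2 can be sketched in Python. This is a simplified illustration, not the three-tape construction itself: it assumes the machine has already been decoded into a transition table (so step 1 and the binary code (M) are skipped), it assumes a deterministic machine on a tape infinite in both directions, and it adds a step bound because the simulated machine may never halt. The names `simulate`, `Delta`, `blank` and `max_steps` are hypothetical.

```python
from typing import Dict, Optional, Tuple

# Decoded transition table: (state, symbol) -> (new state, new symbol, direction).
Delta = Dict[Tuple[str, str], Tuple[str, str, str]]


def simulate(delta: Delta, start: str, accept: str, w: str,
             blank: str = "B", max_steps: int = 10_000) -> Optional[bool]:
    """Simulate a decoded deterministic machine on the word w.

    Returns True if the accepting state is reached, False if no transition
    matches (the machine stops without accepting), and None if max_steps
    is exhausted, in which case nothing can be concluded.
    """
    tape = dict(enumerate(w))    # sparse tape: position -> symbol
    state, head = start, 0       # 'state' plays the role of the third tape
    for _ in range(max_steps):
        if state == accept:
            return True
        symbol = tape.get(head, blank)
        move = delta.get((state, symbol))
        if move is None:
            return False
        # Write the new symbol, change state, then move the head.
        state, tape[head], direction = move
        head += 1 if direction == "R" else -1
    return True if state == accept else None


# Example machine (an assumption for illustration): accepts words containing a 1.
HAS_ONE: Delta = {
    ("q0", "0"): ("q0", "0", "R"),
    ("q0", "1"): ("qa", "1", "R"),
}
# Example machine that never halts: it keeps moving right.
LOOP: Delta = {
    ("q0", "0"): ("q0", "0", "R"),
    ("q0", "B"): ("q0", "B", "R"),
}
```

The `None` result reflects why this construction shows only that L_u is recursively enumerable: the simulation can confirm acceptance, but an unbounded simulation of a non-halting machine never returns.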


Lemma 8.5. The universal language L_u is not a recursive one. 


Proof. Let us assume that L_u is recursive, i.e. that a Turing machine M' with the stop 
property accepts L_u. Based on this assumption, a machine M_d accepting the diagonal 
language in the class REL can be built, which contradicts Proposition 8.5. 
Therefore, the universal language cannot be recursive. 




A hypothetical Turing machine M_d, assumed to accept the diagonal language in 
the class REL, realizes the following algorithm: 


• for a given input word w, M_d retrieves the index k of w in the canonical order, i.e. 
it finds k such that w = w_k, 

• M_d retrieves the binary representation (M_k) of the kth Turing machine M_k, 

• M_d concatenates the binary representation (M_k) of the kth Turing machine M_k with 
w, 

• M_d simulates the computation of the hypothetical machine M' on the concatenation 
of (M_k) and w, 

• M_d terminates its computation if and only if M' terminates its computation; then M_d 
reverses the output of M', i.e. M_d accepts if and only if M' rejects. 


Note that M_d accepts w if and only if the kth Turing machine M_k does not accept 
w = w_k. Moreover, M_d has the stop property since the hypothetical Turing machine 
M' is assumed to have the stop property. 
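
The construction of M_d from M' can be written as a higher-order function. The sketch below is hedged accordingly: a decider M' for L_u cannot exist, so `decides_universal` is only a parameter, and the usage part substitutes hypothetical, always-terminating Python predicates for Turing machines. All names (`make_diagonal_decider`, `decides_universal`, `encode`, `index_of`, `TOY`) are assumptions made for illustration, and the code (M_k) and the word w are passed as two arguments instead of one concatenated word.

```python
from typing import Callable, List


def make_diagonal_decider(
    decides_universal: Callable[[str, str], bool],
    encode: Callable[[int], str],
    index_of: Callable[[str], int],
) -> Callable[[str], bool]:
    """Build M_d from a hypothetical total decider M' for the universal language.

    decides_universal(code, w) stands for running M' on the word (M)w.
    """
    def m_d(w: str) -> bool:
        k = index_of(w)                       # find k with w = w_k
        code = encode(k)                      # binary representation (M_k)
        answer = decides_universal(code, w)   # simulate M' on (M_k)w
        return not answer                     # reverse the output of M'
    return m_d


# Toy stand-ins (assumptions): three always-terminating "machines".
TOY: List[Callable[[str], bool]] = [
    lambda w: True,
    lambda w: False,
    lambda w: w == "1",
]


def toy_index_of(w: str) -> int:
    """0-based index of w in the order "", "0", "1", "00", ..."""
    return int("1" + w, 2) - 1


def toy_encode(k: int) -> str:
    """Toy code of the k-th machine: k written in binary."""
    return format(k, "b")


def toy_m_prime(code: str, w: str) -> bool:
    """Toy replacement for M': decode the index and run the predicate."""
    return TOY[int(code, 2)](w)


m_d = make_diagonal_decider(toy_m_prime, toy_encode, toy_index_of)
```

Since `m_d` calls `decides_universal` exactly once and negates its answer, it terminates whenever the supplied decider does, which mirrors the remark that M_d inherits the stop property from M'.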


Remark 8.4. The following inclusions hold, based on the discussion in this section: 


RgL ⊂ CFL ⊂ CSL ⊂ RkL ⊂ REL ⊂ ALL 


The following inclusions are called the Chomsky hierarchy: 


RgL ⊂ CFL ⊂ CSL ⊂ REL ⊂ ALL 


where ⊂ denotes inclusion, but not equality. 


8.3 Closeness 


In this section we will consider the classes of languages examined in the previous 
section: the regular, context-free, context-sensitive, recursive and recursively 
enumerable classes of languages, i.e. the RgL, CFL, CSL, RkL and REL classes, and the 
class of all languages, the ALL class. 

Let us examine the closeness problem from two points of view. The first approach employs 
grammars which generate the languages of the relevant classes. This approach allows 
showing that union is an inner operation in all classes of languages generated by 
grammars, i.e., in the RgL, CFL, CSL, RkL and REL classes of languages. The second approach 
is based on automata. It allows proving that union is an inner operation in the RgL, 
CFL, CSL and REL classes. We give proofs for both approaches, even though in this way 
proofs are duplicated for some classes. These proofs serve the discussion of 
closeness and also provide useful techniques which 
could be applied in solving other problems. 
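
For the automata-based approach in the regular case, one standard technique is the product construction on deterministic finite automata. The sketch below is a minimal illustration under assumptions: it is not necessarily the proof used in these notes, both automata are assumed to be deterministic with total transition functions over the same alphabet, and the names `DFA`, `accepts`, `union`, `EVEN_ZEROS` and `ENDS_ONE` are hypothetical.

```python
from itertools import product
from typing import Dict, Hashable, NamedTuple, Set, Tuple

State = Hashable


class DFA(NamedTuple):
    states: Set[State]
    alphabet: Set[str]
    delta: Dict[Tuple[State, str], State]   # total transition function
    start: State
    accepting: Set[State]


def accepts(m: DFA, w: str) -> bool:
    """Run the automaton on w and report whether it ends in an accepting state."""
    q = m.start
    for a in w:
        q = m.delta[(q, a)]
    return q in m.accepting


def union(m1: DFA, m2: DFA) -> DFA:
    """Product automaton accepting L(m1) ∪ L(m2); both share one alphabet."""
    states = set(product(m1.states, m2.states))
    # Both components move simultaneously on every input symbol.
    delta = {((p, q), a): (m1.delta[(p, a)], m2.delta[(q, a)])
             for (p, q) in states for a in m1.alphabet}
    # A pair is accepting if at least one component is accepting.
    accepting = {(p, q) for (p, q) in states
                 if p in m1.accepting or q in m2.accepting}
    return DFA(states, m1.alphabet, delta, (m1.start, m2.start), accepting)


# Words with an even number of 0s.
EVEN_ZEROS = DFA(
    {"e", "o"}, {"0", "1"},
    {("e", "0"): "o", ("e", "1"): "e", ("o", "0"): "e", ("o", "1"): "o"},
    "e", {"e"},
)
# Words ending with 1.
ENDS_ONE = DFA(
    {"n", "y"}, {"0", "1"},
    {("n", "0"): "n", ("n", "1"): "y", ("y", "0"): "n", ("y", "1"): "y"},
    "n", {"y"},
)
```

Replacing `or` by `and` in the definition of `accepting` yields an automaton for the intersection of the two languages.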


Remark 8.5. Taking a subset of a given set is a very basic operation. However, it 
is not an inner operation in any class of languages except the ALL class. It means 
that a subset of a language of a given class may not belong to this class, unless it is 
the ALL class. 


Remark 8.6. The ALL class is closed with regard to any operation. Therefore, further 
discussion does not take this class into account. 


Proposition 8.6. Union is an inner operation in all classes of languages. 
Proposition 8.7. Concatenation is an inner operation in all classes of languages. 
Proposition 8.8. Kleene closure is an inner operation in all classes of languages. 


Proposition 8.9. Substitution is an inner operation in RgL, CFL, CSL and REL 
classes of languages. 


Proposition 8.10. Intersection is an inner operation in RgL, CSL, RkL and REL 
classes of languages. 


Proposition 8.11. Intersection is not an inner operation in CFL class of languages. 
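
A standard counterexample for Proposition 8.11, which may differ from the argument intended in these notes, uses L1 = {a^i b^i c^j} and L2 = {a^i b^j c^j}. Both are context-free, while their intersection is {a^n b^n c^n}, which is not context-free by the pumping lemma for context-free languages. The Python sketch below only checks the set identity L1 ∩ L2 = {a^n b^n c^n} on all short words; it does not prove that the intersection is not context-free. The function names are hypothetical.

```python
import re
from itertools import product
from typing import Optional, Tuple


def exponents(w: str) -> Optional[Tuple[int, int, int]]:
    """Return (i, j, k) if w = a^i b^j c^k, otherwise None."""
    m = re.fullmatch(r"(a*)(b*)(c*)", w)
    if m is None:
        return None
    return len(m.group(1)), len(m.group(2)), len(m.group(3))


def in_l1(w: str) -> bool:
    """L1 = {a^i b^i c^j}: equally many a's and b's."""
    e = exponents(w)
    return e is not None and e[0] == e[1]


def in_l2(w: str) -> bool:
    """L2 = {a^i b^j c^j}: equally many b's and c's."""
    e = exponents(w)
    return e is not None and e[1] == e[2]


def in_l3(w: str) -> bool:
    """L3 = {a^n b^n c^n}."""
    e = exponents(w)
    return e is not None and e[0] == e[1] == e[2]


# All words over {a, b, c} of length at most 6.
WORDS = ["".join(t) for n in range(7) for t in product("abc", repeat=n)]

# On these words, membership in both L1 and L2 coincides with membership in L3.
assert all((in_l1(w) and in_l2(w)) == in_l3(w) for w in WORDS)
```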


Proposition 8.12. Complement is an inner operation in RgL, CSL and RkL classes 
of languages. 


Proposition 8.13. Complement is not an inner operation in CFL class. 
Proposition 8.14. Complement is not an inner operation in REL class. 


Proposition 8.15. The inverse homomorphic image is an inner operation in the 
class of regular languages. 


Proposition 8.16. Quotients (left and right) are inner operations in the class of 
regular languages. 




Table 8.2 Closeness of operations on languages 


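is inner in      | RgL | CFL | CSL | RkL | REL | ALL 
union            |  +  |  +  |  +  |  +  |  +  |  + 
concatenation    |  +  |  +  |  +  |  +  |  +  |  + 
Kleene closure   |  +  |  +  |  +  |  +  |  +  |  + 
substitution     |  +  |  +  |  +  |     |  +  |  + 
homomorphism     |  +  |  +  |  +  |     |  +  |  + 
intersection     |  +  |  -  |  +  |  +  |  +  |  + 
complement       |  +  |  -  |  +  |  +  |  -  |  + 
inv. hom. image  |  +  |     |     |     |     |  + 
quotients        |  +  |     |     |     |     |  + 

(+ the operation is inner in the class, - it is not; an empty cell means that the entry could not be recovered) 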




